I’m using the Zip utility from the Info-ZIP library to compress a directory tree into an xlsx file.
For that I’m using the following command:
zip -r -D res.xlsx source
Here source contains the correct directory tree of the xlsx file.
But if you then look at the resulting file structure, the source directory is included at the top level of the paths of all files and directories, and MS Office Excel cannot open this file. This is a well-known problem. To avoid it, zip.exe has to be run from inside the directory being compressed.
The problem is that I want to use the source code of this utility in my project, so I cannot simply launch it as a separate process that compresses the directories into xlsx files.
I’ve tried to find the place in the zip source code where the parent directory is added at the top level of the paths, but it seems to be done implicitly.
Can anyone suggest how this can be done?
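To make the desired archive layout concrete, here is a minimal sketch in Python. It uses Python's standard zipfile module, not the Info-ZIP C source, so it does not show where Info-ZIP builds its entry names. The function name zip_tree_contents is hypothetical. Each entry name is computed relative to the source directory, so the parent directory never appears in the stored paths, and no directory entries are written (similar in effect to -D).

```python
import os
import zipfile


def zip_tree_contents(source_dir, archive_path):
    """Zip the files under source_dir with names relative to source_dir.

    archive_path should live outside source_dir so the archive does not
    try to include itself.
    """
    with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for root, dirs, files in os.walk(source_dir):
            dirs.sort()  # deterministic traversal order
            for name in sorted(files):
                full_path = os.path.join(root, name)
                # Store the path relative to source_dir, so the archive
                # holds "xl/workbook.xml" rather than "source/xl/workbook.xml".
                rel_path = os.path.relpath(full_path, source_dir)
                zf.write(full_path, rel_path.replace(os.sep, "/"))
```

Calling zip_tree_contents("source", "res.xlsx") would then produce entries whose names start directly at the contents of source.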
Just like the Unix command tar -czf xxx.tgz xxx/, is there a way to do the same thing in HDFS? I have a folder in HDFS that holds over 100k small files, and I want to download it to the local file system as fast as possible. hadoop fs -get is too slow. I know hadoop archive can output a har, but it does not seem to solve my problem.
From what I see here,
https://issues.apache.org/jira/browse/HADOOP-7519
it is not possible to perform a tar operation using hadoop commands. This has been filed as an improvement (the issue linked above) and is not yet resolved or available for use.
Hope this answers your question.
Regarding your scenario: having 100k small files in HDFS is not good practice. You can find a way to merge them all (maybe by creating tables from this data through Hive or Impala), or move all the small files to a single folder in HDFS and use hadoop fs -copyToLocal <HDFS_FOLDER_PATH> to copy the whole folder, along with all the files in it, to your local file system.
I am working on an i.MX6 ARM processor-based hardware platform running embedded Linux. I am using the tar command to compress a directory containing 3 files. When I try to decompress the archive to get the three files back, I run into different issues:
I get only one or two files after decompressing.
The data in the decompressed file does not match the data in the original file.
I am using these commands to create and decompress the tar file:
nohup tar -zcpf /home/root/upload_new_U1/unit1-`(date +%Y%m%d_%H-00)`.tar.gz upload_now_U1
tar -zxpf unit1-`(date +%Y%m%d_%H-00)`.tar.gz
Please help.
I am on a Mac and am trying to import a virtual machine image (.ova file). When I try to import the file I get the following error:
Could not find a storage controller named 'SCSI Controller'
Are there any existing solutions for this problem?
I got a clue to the answer from here: https://ctors.net/2014/07/17/vmware_to_virtualbox
Basically you need to change the virtual disk controller, e.g. change ddb.adapterType from "buslogic" or "lsilogic" to "ide".
However, if you don't have VMware available to boot the original image, remove the VMware tools and remove the hard disk, you can hack the .ovf file inside the .ova file to switch the virtual SCSI controller to an IDE controller.
Here's how.
First open the ova archive. Let's assume it's in the current dir and called vm.ova:
mkdir ./temp
cd temp
tar -xvf ../vm.ova
This will extract 3 files: an *.ovf file, a virtual disk *.vmdk file, and a manifest .mf file.
Edit the .ovf file and find the SCSI reference; it will be "lsilogicsas", "buslogic" or "lsilogic". Replace that word with ide.
While you are at it, you may want to rename all the files so that they don't have spaces or strange chars in their names; this makes them more UNIX friendly. Of course, if you rename the files you need to update the references in the .ovf and .mf files.
Because you've modified the files, you need to recompute the sha1 values in the .mf file, e.g. run sha1sum to get each value and replace the old ones in the .mf file.
$ sha1sum vm.ovf
4806ebc2630d9a1325ed555a396c00eadfc72248 vm.ovf
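If sha1sum is not available (for example on a stock Mac), the same digest can be computed with Python's standard hashlib module. This is only a sketch; the helper name sha1_of_file is hypothetical, and the file is read in chunks because .vmdk files can be large.

```python
import hashlib


def sha1_of_file(path, chunk_size=1024 * 1024):
    """Return the SHA-1 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha1()
    with open(path, "rb") as fh:
        # iter() with a sentinel keeps calling read() until it returns b"".
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

sha1_of_file("vm.ovf") returns the same 40-character hex string that sha1sum prints in its first column; paste it over the old value in the .mf file.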
Now that you've swapped the disk controller and fixed up the manifest's sha1 values, you can pack the .ova back up. The files have to be in order inside the archive, so do this (use your own file names):
tar -cvf ../vm-new.ova ./vm.ovf
tar -rvf ../vm-new.ova ./vm.vmdk
tar -rvf ../vm-new.ova ./vm.mf
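The same ordered repack can be sketched with Python's standard tarfile module, which writes members in exactly the order they are added. The function name pack_ova is hypothetical, and unlike the tar commands above this sketch stores each file under its bare name (vm.ovf rather than ./vm.ovf).

```python
import os
import tarfile


def pack_ova(ova_path, members):
    """Write the given files into an uncompressed tar archive, in order."""
    # Mode "w" creates a plain (uncompressed) tar, like "tar -cvf".
    with tarfile.open(ova_path, "w") as tar:
        for path in members:
            # Members are appended in list order; arcname drops any
            # leading directory so only the file name is stored.
            tar.add(path, arcname=os.path.basename(path))
```

For the example above this would be called as pack_ova("../vm-new.ova", ["vm.ovf", "vm.vmdk", "vm.mf"]).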
Done. Now you can open VirtualBox, click File -> Import Appliance and point it at the vm-new.ova file. Once that is done you should be able to start the VM.
hope that helps.
Cheers Karl
I ran into a similar problem and just extracted the .ova file and created a new VM with my own settings using the .vmdk file.
tar -xvf vm.ova
vm.ovf
vm.vmdk
vm.mf
I was using VirtualBox on my PC (Windows 7).
I managed to view some files in my .VDI file.
How can I open or view the contents of my .vdi file and retrieve the files from there?
I had a corrupted VDI file (according to countless VDI-viewer programs I've used with cryptic errors like invalid handle, no file selected, please format disk) and I was not able to open the file, even with VirtualBox. I tried to convert it using the VirtualBox command line tools, with no success. I tried mounting it to a new virtual machine, tried mounting it with ImDisk, no dice. I read four Microsoft TechNet articles, downloaded their utilities and tried countless things; no success.
However, when I tried 7Zip (https://www.7-zip.org/download.html) I was able to view all of the files, and extract them selectively. Here's how I did it:
Install 7zip (make sure that you also install the context-menu items, if prompted).
Right-click on the VDI file and select "Open Archive".
When the window appears, find the largest file in the archive (there should be two files: one is "Basic Microsoft Data Partition" and the other is called system or something similar). The file size is listed in bytes to the right of each file. Select the largest one and choose "Open inside".
You should now see all of the files inside the archive. You can drag the files that you'd like to extract right to your desktop. You can also double-click on folders to look inside them.
If 7zip gives you a cryptic error after extracting the files, it means that you closed the folder's window that you are copying files to in Windows Explorer.
If you didn't close the window and you're still getting an error, try extracting each sub-folder individually. Also make sure that you have enough local hard drive space to copy the files to, even if you are copying them just to an external disk, as 7zip copies them first to your local disk. If the files are highly compressible, you might be able to get away with using NTFS compression for the AppData/temp folder so that when 7zip extracts the files locally, it'll compress them so that it can copy them over to your other disk.
You can mount partitions from .vdi images using qemu-nbd:
sudo apt install qemu-utils
sudo modprobe nbd
vdi="/path/to/your.vdi" # <<== Edit this
sudo qemu-nbd -c /dev/nbd0 "$vdi"
# view partitions and select the one you want to mount.
# Using parted here, but you can also use cfdisk, fdisk, etc.
sudo parted /dev/nbd0 print
part=nbd0p2 # <<== partition you want to mount
sudo mkdir /mnt/vdi
sudo mount /dev/$part /mnt/vdi
Some users seem to need to add a parameter to the modprobe command. I didn't with Ubuntu 16.04, but if it doesn't work for you, try adding max_part=16:
sudo modprobe nbd max_part=16
When done:
sudo umount /dev/$part
sudo qemu-nbd --disconnect /dev/nbd0
Try out VMXray.
You can explore your vmdk image right inside your browser. Select the files that you want to extract and extract them to the desired location. Not just vmdk: you can use VMXRay for looking into and extracting files from RAW, QEMU/KVM QCOW2, VirtualBox VDI, and ISO images. ext2, ext3, FAT and NTFS are the currently supported file systems. You can also use this to recover deleted photos from raw dumps of your camera's SD card, for example.
And, do not worry, no data from your files is ever sent over the network. Data never leaves your machine. VMXRay works completely inside your browser.
As a first approach, you can simply try any archive viewer to open the .vdi file.
I used 7zip to open an Ubuntu Mate .vdi file and it showed the whole Linux file system.
An easy way is to attach the VDI as a second disk in another Virtual Machine.
The drive does not appear immediately; in Windows, open Disk Management, bring the disk online and assign it a drive letter.
You can use ImDisk to mount a VDI file as a local drive in Windows. Follow this virtualbox forum thread and become happy )) Also, you can convert the VDI to VHD and use the default Windows Disk Management to mount the VHD (described here)