I am trying to encode an image using the JBIG2 encoder that I installed via MacPorts.
https://ports.macports.org/port/jbig2enc/
I have also installed Leptonica from MacPorts:
https://ports.macports.org/port/leptonica/
The system seems to have installed it:
% jbig2 -V --version
jbig2enc 0.28
Also, from jbig2 --help I am getting this
% jbig2 --help
Usage: jbig2 [options] <input filenames...>
Options:
-b <basename>: output file root name when using symbol coding
-d --duplicate-line-removal: use TPGD in generic region coder
-p --pdf: produce PDF ready data
-s --symbol-mode: use text region, not generic coder
-t <threshold>: set classification threshold for symbol coder (def: 0.85)
-T <bw threshold>: set 1 bpp threshold (def: 188)
-r --refine: use refinement (requires -s: lossless)
-O <outfile>: dump thresholded image as PNG
-2: upsample 2x before thresholding
-4: upsample 4x before thresholding
-S: remove images from mixed input and save separately
-j --jpeg-output: write images from mixed input as JPEG
-a --auto-thresh: use automatic thresholding in symbol encoder
--no-hash: disables use of hash function for automatic thresholding
-V --version: version info
-v: be verbose
As the port refers to https://github.com/agl/jbig2enc, I tried the command mentioned there for encoding images:
$ jbig2 -s feyn.tif >feyn.jb2
I ran it on an image, original.jpg. This is what I get:
> jbig2 -s original.jpg &gt;original.jb2
[1] 43894
zsh: command not found: gt
zsh: command not found: original.jb2
sahilsharma@Sahils-Air ~ % JBIG2 compression complete. pages:1 symbols:5 log2:3
(binary JB2 data printed straight to the terminal)
[1] + done jbig2 -s original.jpg
According to '--help', '-s' will do the lossless encoding.
The output says the JBIG2 compression completed, but no .jb2 file has been created.
Has the compression actually taken place? If so, where can I find the encoded image?
I am running this encoder to measure the compression ratio, so I just need the size of the encoded image.
Use >, not the literal characters &gt;. With &gt; the shell backgrounds the command at the &, then tries to run gt and original.jb2 as separate commands, and the encoded data is written to your terminal instead of a file. With a real > the result will then be in original.jb2 (or feyn.jb2 for the upstream example).
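A minimal check of the fixed command, reusing the file names from the question (the ls line just reads off the two sizes for the compression ratio):
jbig2 -s original.jpg > original.jb2
ls -l original.jpg original.jb2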
I don’t understand the instructions given here and here.
Could someone offer a step-by-step guide for the installation of nvCOMP, using the following assumptions and step format (or equivalent):
System info:
Ubuntu 20.04
RTX-3060
NVIDIA driver 470.82.01
CUDA 11.4
GCC 9.4.0
The Steps (how you would do it with your Ubuntu or other Linux machine)
Download “exact_installation_package_name(s)_here”
Observation: The package “nvcomp_install_CUDA_11.x.tgz” from NVIDIA has exactly the structure described here. However, this package seems to be different from the “nvcomp” folder obtained by using git clone https://github.com/NVIDIA/nvcomp.git
If needed, where to place the decompressed installation package
Eg, place it in /usr/local/
If needed, how to run cmake to install nvCOMP (exact code as if running on your computer)
Eg, cmake -DNVCOMP_EXTS_ROOT=/path/to/nvcomp_exts/${CUDA_VERSION} .. make -j (code from this site)
However, is CUDA_VERSION a literal string or a placeholder for, say, CUDA_11.4?
Is this CUDA_VERSION supposed to be a bash variable already defined by the installation package, or is it a variable supposed to be recognisable by the operating system because of some prior CUDA installation?
Besides, what exactly is nvcomp_exts or what does it refer to?
If needed, the code for specifying the path(s) in ~/.bashrc
If needed, how to cmake the sample codes, ie, in which directory to run the terminal and what exact code to run
The exact folder+code sequence to build and run “high_level_quickstart_example.cpp”, which comes with the installation package.
Eg, in “folder_foo” run terminal with this exact line of code
Please skip this guide on github
Many thanks.
I will answer my own question.
System info
Here is the system information obtained from the command line:
uname -r: 5.15.0-46-generic
lsb_release -a: Ubuntu 20.04.5 LTS
nvcc --version: Cuda compilation tools, release 10.1, V10.1.243
nvidia-smi:
Two Tesla K80 (2-in-1 card) and one GeForce (Gigabyte RTX 3060 Vision 12G rev. 2.0)
NVIDIA-SMI 470.82.01
Driver Version: 470.82.01
CUDA Version: 11.4
cmake --version: cmake version 3.22.5
make --version: GNU Make 4.2.1
lscpu: Xeon CPU E5-2680 V4 @ 2.40GHz - 56 CPU(s)
Observation
Although there are two GPUs installed in the server, nvCOMP only works with the RTX.
The Steps
Perhaps "installation" is a misnomer. One only needs to properly compile the downloaded nvCOMP files and run the resulting executables.
Step 1: The nvCOMP library
Download the nvCOMP library from https://developer.nvidia.com/nvcomp.
The file I downloaded was named nvcomp_install_CUDA_11.x.tgz. And I left the extracted folder in the Downloads directory and renamed it nvcomp.
Step 2: The nvCOMP test package on GitHub
Download it from https://github.com/NVIDIA/nvcomp. Click the green "Code" icon, then click "Download ZIP".
By default, the downloaded zip file is called nvcomp-main.zip. And I left the extracted folder, named nvcomp-main, in the Downloads directory.
Step 3: The NVIDIA CUB library on GitHub
Download it from https://github.com/nvidia/cub. Click the green "Code" icon, then click "Download ZIP".
By default, the downloaded zip file is called cub-main.zip. And I left the extracted folder, named cub-main, in the Downloads directory.
There is no "installation" of the CUB library other than making the folder path "known", ie available, to the calling program.
Comments: The nvCOMP GitHub site did not seem to explain that the CUB library was needed to run nvCOMP, and I only found that out from an error message during an attempted compilation of the test files in Step 2.
Step 4: "Building CPU and GPU Examples, GPU Benchmarks provided on Github"
The nvCOMP GitHub landing page has a section with the exact name as this Step. The instructions could have been more detailed.
Step 4.1: cmake
In the Downloads directory are the folders nvcomp (the Step 1 nvCOMP library), nvcomp-main (Step 2), and cub-main (Step 3).
Start a terminal and then go inside nvcomp-main, ie, go to /your-path/Downloads/nvcomp-main
Run cmake -DCMAKE_PREFIX_PATH=/your-path/Downloads/nvcomp -DCUB_DIR=/your-path/Downloads/cub-main . (note the trailing dot for the source directory)
This cmake step sets up the build files for the next make step.
During cmake, a harmless yellow-colored cmake warning appeared
There was also a harmless printout "-- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Failed" per this thread.
The last few printout lines from cmake variously stated it found Threads, nvcomp, ZLIB (on my system) and it was done with "Configuring" and "Build files have been written".
Step 4.2: make
Run make in the same terminal as above.
This is a screenshot of the make compilation.
Please check the before and after folder tree to see what files have been generated.
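Putting Steps 4.1 and 4.2 together, the build amounts to the following (paths follow the Downloads layout assumed above):
cd /your-path/Downloads/nvcomp-main
cmake -DCMAKE_PREFIX_PATH=/your-path/Downloads/nvcomp -DCUB_DIR=/your-path/Downloads/cub-main .
make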
Step 5: Running the examples/benchmarks
Let's run the "built-in" example before running the benchmarks with the (now outdated) Fannie Mae single-family loan performance data from NVIDIA's RAPIDS repository.
Check if there are executables in /your-path/Downloads/nvcomp-main/bin. These are the executables created by the cmake and make steps above.
You can try running these executables, which are built around different compression algorithms and functionalities, on your to-be-compressed files. The name of each executable indicates the algorithm used and/or its functionality.
Some of the executables require the files to be of a certain size, eg, the "benchmark_cascaded_chunked" executable requires the target file's size to be a multiple of 4 bytes. I have not tested all of these executables.
Step 5.1: CPU compression examples
Per https://github.com/NVIDIA/nvcomp
Start a terminal (anywhere)
Run time /your-path/Downloads/nvcomp-main/bin/gdeflate_cpu_compression -f /full-path-to-your-target/my-file.txt
Here are the results of running gdeflate_cpu_compression on an updated Fannie Mae loan data file "2002Q1.csv" (11GB)
Similarly, change the name of the executable to run lz4_cpu_compression or lz4_cpu_decompression
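For example, assuming the lz4 executable takes the same -f flag as gdeflate_cpu_compression above (the target path is a placeholder):
time /your-path/Downloads/nvcomp-main/bin/lz4_cpu_compression -f /full-path-to-your-target/my-file.txt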
Step 5.2: The benchmarks with the Fannie Mae files from NVIDIA Rapids
Apart from following the NVIDIA instructions here, it seems the "benchmark" executables in the above "bin" directory can be run with "any" file. Just use the executable in the same way as in Step 5.1 and adhere to the particular executable specifications.
Below is one example following the NVIDIA instruction.
Long story short, the nvcomp-main (Step 2) test package contains the files to (i) extract a column of homogeneous data from an outdated Fannie Mae loan data file, (ii) save the extraction in binary format, and (iii) run the benchmark executable(s) on the binary extraction.
The Fannie Mae single-family loan performance data files, old or new, all use "|" as the delimiter. In the outdated Rapids version, the first column, indexed as column "0" in the code (zero-based numbering), contains the 12-digit loan IDs for the loans sampled from the (real) Fannie Mae loan portfolio. In the new Fannie Mae data files from the official Fannie Mae site, the loan IDs are in column 2 and the data files have a csv file extension.
Download the dataset "1 Year" Fannie Mae data, not the "1GB Splits*" variant, by following the link from here, or by going directly to RAPIDS
Place the downloaded mortgage_2000.tgz anywhere and unzip it with tar -xvzf mortgage_2000.tgz.
There are four txt files in /mortgage_2000/perf. I will use Performance_2000Q1.txt as an example.
Check if python is installed on the system
Check if text_to_binary.py is in /nvcomp-main/benchmarks
Start a terminal (anywhere)
As shown below, use the python script to extract the first column, indexed "0", with format long, from Performance_2000Q1.txt, and put the .bin output file somewhere.
Run time python /your-path/Downloads/nvcomp-main/benchmarks/text_to_binary.py /your-other-path-to/mortgage_2000/perf/Performance_2000Q1.txt 0 long /another-path/2000Q1-col0-long.bin
For comparison of the benchmarks, run time python /your-path/Downloads/nvcomp-main/benchmarks/text_to_binary.py /your-other-path-to/mortgage_2000/perf/Performance_2000Q1.txt 0 string /another-path/2000Q1-col0-string.bin
Run the benchmarking executables with the target bin files as shown at the bottom of the web page of the NVIDIA official guide
Eg, /your-path/Downloads/nvcomp-main/bin/benchmark_hlif lz4 -f /another-path/2000Q1-col0-long.bin
Just make sure the operating system knows where the executable and the target file are.
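Condensed, the whole Step 5.2 sequence looks like this (all paths are placeholders):
tar -xvzf mortgage_2000.tgz
time python /your-path/Downloads/nvcomp-main/benchmarks/text_to_binary.py /your-other-path-to/mortgage_2000/perf/Performance_2000Q1.txt 0 long /another-path/2000Q1-col0-long.bin
/your-path/Downloads/nvcomp-main/bin/benchmark_hlif lz4 -f /another-path/2000Q1-col0-long.bin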
Step 5.3: The high_level_quickstart_example and low_level_quickstart_example
These two executables are in /nvcomp-main/bin
They are completely self-contained. Just run, e.g., high_level_quickstart_example without any input arguments. See the corresponding C++ source code in /nvcomp-main/examples and the official nvCOMP guides on GitHub.
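For instance, with the layout above, a run is simply:
cd /your-path/Downloads/nvcomp-main/bin
./high_level_quickstart_example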
Observations after some experiments
This could be another long thread but let's keep it short. Note that NVIDIA used various A-series cards for its benchmarks and I used a GeForce RTX 3060.
Speed
The python script is slow. It took 4m12.456s to extract the loan ID column from an 11.8 GB Fannie Mae data file (with 108 columns) using format "string".
In contrast, R with data.table took 25.648 seconds to do the same.
With the outdated "Performance_2000Q1.txt" (0.99 GB) tested above, the python script took 32.898s whereas R took 26.965s to do the same extraction.
Compression ratio
"Bloated" python outputs.
The R-output "string.txt" files are generally a quarter of the size of the corresponding python-output "string.bin" files.
Applying the executables to the R-output files achieved much better compression ratios and throughputs than applying them to the python-output files.
Eg, running benchmark_hlif lz4 -f 2000Q1-col0-string.bin with the python output vs running benchmark_hlif lz4 -f 2000Q1-col0-string.txt with the R output
Uncompressed size: 436,544,592 vs 118,230,827 bytes
Compressed size: 233,026,108 vs 4,154,261 bytes
Compression ratio: 1.87 vs 28.46
Compression throughput (GB/s): 2.42 vs 18.96
Decompression throughput (GB/s): 8.86 vs 91.50
Wall time: 2.805 vs 1.281s
Overall performance: accounting for file size and memory limits
Use of the nvCOMP library is limited by GPU memory, no more than 12 GB on the RTX 3060 tested. Depending on the compression algorithm, an 8 GB target file can easily trigger a stop with cudaErrorMemoryAllocation: out of memory.
In both speed and compression ratio, pigz trumped the tested nvCOMP executables when the target files were the new Fannie Mae data files containing 108 columns of strings and numbers.
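For anyone reproducing the pigz comparison, the baseline is just a plain command-line run along these lines (the flags are my choice, not a tuned setup; -k keeps the input file):
time pigz -k 2002Q1.csv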
First, I created a 1 GB file and moved it into the target folder, then I compressed the folder using 7z a target.7z target.
Later I appended a 'hello' string to the tail of the 1 GB file. When I re-compressed the target folder using the update option, 7z u target.7z target, I observed that the updated file was compressed all over again instead of only its updated section being compressed.
[Q] How can I force 7z to compress only the updated section of the file instead of re-compressing the complete updated file? Is there any alternative compression method that achieves this goal?
Example:
$ mkdir target
$ fallocate -l 1G target/temp_1GB_file
$ time 7z a target.7z target
7-Zip [64] 9.20 Copyright (c) 1999-2010 Igor Pavlov 2010-11-18
p7zip Version 9.20 (locale=en_US.UTF-8,Utf16=on,HugeFiles=on,8 CPUs)
Scanning
Updating archive target.7z
Compressing target/temp_1GB_file
Compressing target/target.7z
Everything is Ok
real 0m23.054s
user 0m30.316s
sys 0m1.047s
$ echo 'hello' >> target/temp_1GB_file
$ time 7z u target.7z target # Here the complete file is compressed all over again.
7-Zip [64] 9.20 Copyright (c) 1999-2010 Igor Pavlov 2010-11-18
p7zip Version 9.20 (locale=en_US.UTF-8,Utf16=on,HugeFiles=on,8 CPUs)
Scanning
Updating archive target.7z
Compressing target/temp_1GB_file
Everything is Ok
real 0m23.861s
user 0m30.781s
sys 0m1.192s
Here, as you can see, I appended the 'hello' string to the file, and instead of compressing only the block containing 'hello' and merging it with the already compressed 1 GB archive, the complete file was re-compressed.
7z is not designed for that.
You can look at the gzlog.h and gzlog.c code in zlib's examples directory for an example of how to append short messages efficiently to a compressed file.
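As a rough illustration of the append idea with plain gzip (not the gzlog code itself): gzip members can be concatenated, so you can compress only the newly appended data and tack it onto the existing archive, and the result still decompresses as one continuous stream.
gzip -c target/temp_1GB_file > log.gz      # compress the original once
printf 'hello\n' | gzip -c >> log.gz       # later: compress and append only the new data
zcat log.gz > restored_file                # decompresses both members as one file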
I want to convert 16 kHz PCM to 48 kHz WAV using sox.
However, the pcm file isn't accepted by sox,
so I just renamed the pcm file to raw
and then ran:
sox -r 16000 -e signed -b 16 -c 1 test.raw -r 48000 out.wav
Can I use the pcm file directly, without converting it to raw?
For the PCM file, since PCM files are headerless, you need to add '-t raw' as the first argument.
sox -t raw -r 16000 -e signed -b 16 -c 1 test.raw -r 48000 out.wav
Try that out.
Also try the different endian settings (-L, -B, -x),
though only use one at a time, and only if omitting them doesn't work.
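For example, assuming the samples are little-endian signed 16-bit (check your source), -t raw also lets you point sox at the .pcm file directly:
sox -t raw -L -r 16000 -e signed -b 16 -c 1 test.pcm -r 48000 out.wav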
There is no need to convert the input file into raw. Sox can handle pcm files.
sox input.pcm -r 48000 output.wav
The input file can either be a .pcm or .wav.
Since .wav files have a header containing audio metadata (such as sample rate, bit precision, file length, etc), you don't have to pass any information about the input file. Hence, no need to use:
-r 16000 -e signed -b 16 -c 1
By converting pcm to raw, you have just stripped away the file header.
I want to extract kernel symbols from a u-boot image
The final goal is to debug syscalls with gdb
The kernel is compiled with CONFIG_DEBUG_INFO=y and gcc is using the -g option (I checked).
After make uImage, I have:
# file arch/arm/boot/*
arch/arm/boot/bootp: directory
arch/arm/boot/compressed: directory
arch/arm/boot/Image: data
arch/arm/boot/install.sh: POSIX shell script text executable
arch/arm/boot/Makefile: ASCII English text
arch/arm/boot/uImage: u-boot legacy uImage, Linux-3.0.6, Linux/ARM, OS Kernel Image (Not compressed), 3044476 bytes, Thu Mar 22 18:13:40 2012, Load Address: 0x00008000, Entry Point: 0x00008000, Header CRC: 0xF689B805, Data CRC: 0x6BFE76DF
arch/arm/boot/zImage: data
gdb cannot load uImage directly
I tried this script http://forum.xda-developers.com/showthread.php?t=901152.
# file arch/arm/boot/zImage_unpacked/*
arch/arm/boot/zImage_unpacked/decompression_code: data
arch/arm/boot/zImage_unpacked/initramfs.cpio+part3: data
arch/arm/boot/zImage_unpacked/kernel.img: data
arch/arm/boot/zImage_unpacked/padding_piggy: data
arch/arm/boot/zImage_unpacked/piggy: data
arch/arm/boot/zImage_unpacked/piggy.gz: gzip compressed data, from Unix, max compression
arch/arm/boot/zImage_unpacked/piggy.gz+piggy_trailer: gzip compressed data, from Unix, max compression
arch/arm/boot/zImage_unpacked/piggy_trailer: data
arch/arm/boot/zImage_unpacked/sizes: ASCII text
kernel.img is not loadable by gdb
Do make vmlinux. I believe GDB can read that, but it's been a long time and a lot of kernel versions since I tried.
EDIT: Oh, I should say, vmlinux and uImage should be the same kernel, just packaged differently. If that's not the case, then this won't work.
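A rough sketch of that workflow, assuming an ARM cross toolchain and a target you can attach to over a gdb stub such as QEMU or JTAG (the toolchain prefix and the port are assumptions on my part), then breaking on a syscall handler such as sys_open:
make ARCH=arm CROSS_COMPILE=arm-linux-gnueabi- vmlinux
arm-linux-gnueabi-gdb vmlinux
(gdb) target remote :1234
(gdb) break sys_open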