Compress image before saving without changing ratio in .NET Core

In my code I take an image as input.
My users don't know much about technology, so they expect to upload an image of any size they want, but I have limitations and can't store large images on my disk, so I want to compress images before saving them.
I'd appreciate any help with this compression.

This is how I solved my problem using the Magick.NET library:
I used the Quality property to reduce the size, lowering it in a loop until the image size reaches my target size:
MagickImage magickImage = new MagickImage(image);
int quality = magickImage.Quality;
long length = image.Length;
while (length > fileSizeInByte)
{
    if (quality <= 6)
    {
        return Error; // below this quality the picture becomes illegible
    }
    // Lower the quality by 15 percent of its current value each iteration.
    quality = quality - ((quality * 15) / 100);
    MagickImage lowQualityImage = new MagickImage(image);
    lowQualityImage.Quality = quality;
    byte[] newImageBytes = lowQualityImage.ToByteArray();
    if (newImageBytes.Length <= fileSizeInByte)
    {
        magickImage.Quality = quality;
        length = newImageBytes.Length;
    }
}
I found that if the quality goes below 6, the picture becomes illegible.

Related

NVenc's Output Bitstream is not readable

I have one question related to NVIDIA's NVenc API. I want to use the API to encode some OpenGL graphics. My problem is that the API reports no error throughout the whole program; everything seems to be fine. But the generated output is not readable by, e.g., VLC. If I try to play the generated file, VLC flashes a black screen for about 0.5 s and then ends the playback.
The video has a length of 0, and its file size seems rather small, too.
The resolution is 1280x720 and a 5-second recording is only about 700 kB. Is this realistic?
The flow of the application is as follows:
Render to a secondary framebuffer
Download the framebuffer to one of two PBOs (glReadPixels())
Map the PBO of the previous frame to get a pointer understandable by CUDA.
Call a simple CUDA kernel converting OpenGL's RGBA to ARGB, which should be understandable by NVenc according to this (p. 18). The kernel reads the content of the PBO and writes the converted content into a CudaArray (created with cudaMalloc) which is registered as an InputResource with NVenc.
The content of the converted array gets encoded. A completion event plus the corresponding output bitstream buffer get queued.
A secondary thread listens on the queued output events; if one event is signaled, the output bitstream gets mapped and written to the HDD.
The initialization of the NVenc encoder:
InitParams* ip = new InitParams();
m_initParams = ip;
memset(ip, 0, sizeof(InitParams));
ip->version = NV_ENC_INITIALIZE_PARAMS_VER;
ip->encodeGUID = m_encoderGuid; //Used Codec
ip->encodeWidth = width; // Frame Width
ip->encodeHeight = height; // Frame Height
ip->maxEncodeWidth = 0; // Zero means no dynamic res changes
ip->maxEncodeHeight = 0;
ip->darWidth = width; // Aspect Ratio
ip->darHeight = height;
ip->frameRateNum = 60; // 60 fps
ip->frameRateDen = 1;
ip->reportSliceOffsets = 0; // According to programming guide
ip->enableSubFrameWrite = 0;
ip->presetGUID = m_presetGuid; // Used Preset for Encoder Config
NV_ENC_PRESET_CONFIG presetCfg; // Load the Preset Config
memset(&presetCfg, 0, sizeof(NV_ENC_PRESET_CONFIG));
presetCfg.version = NV_ENC_PRESET_CONFIG_VER;
presetCfg.presetCfg.version = NV_ENC_CONFIG_VER;
CheckApiError(m_apiFunctions.nvEncGetEncodePresetConfig(m_Encoder,
m_encoderGuid, m_presetGuid, &presetCfg));
memcpy(&m_encodingConfig, &presetCfg.presetCfg, sizeof(NV_ENC_CONFIG));
// And add information about Bitrate etc
m_encodingConfig.rcParams.averageBitRate = 500000;
m_encodingConfig.rcParams.maxBitRate = 600000;
m_encodingConfig.rcParams.rateControlMode = NV_ENC_PARAMS_RC_MODE::NV_ENC_PARAMS_RC_CBR;
ip->encodeConfig = &m_encodingConfig;
ip->enableEncodeAsync = 1; // Async Encoding
ip->enablePTD = 1; // Encoder handles picture ordering
Registration of CudaResource
m_cuContext->SetCurrent(); // Make the clients cuCtx current
NV_ENC_REGISTER_RESOURCE res;
memset(&res, 0, sizeof(NV_ENC_REGISTER_RESOURCE));
NV_ENC_REGISTERED_PTR resPtr; // handle to the cuda resource for future use
res.bufferFormat = m_inputFormat; // Format is ARGB
res.height = m_height;
res.width = m_width;
// NOTE: I've set the pitch to the width of the frame, because the resource is a non-pitched
//cudaArray. Is this correct? Pitch = 0 would produce no output.
res.pitch = pitch;
res.resourceToRegister = (void*) (uintptr_t) resourceToRegister; //CUdevptr to resource
res.resourceType =
NV_ENC_INPUT_RESOURCE_TYPE::NV_ENC_INPUT_RESOURCE_TYPE_CUDADEVICEPTR;
res.version = NV_ENC_REGISTER_RESOURCE_VER;
CheckApiError(m_apiFunctions.nvEncRegisterResource(m_Encoder, &res));
m_registeredInputResources.push_back(res.registeredResource);
Encoding
m_cuContext->SetCurrent(); // Make Clients context current
MapInputResource(id); //Map the CudaInputResource
NV_ENC_PIC_PARAMS temp;
memset(&temp, 0, sizeof(NV_ENC_PIC_PARAMS));
temp.version = NV_ENC_PIC_PARAMS_VER;
unsigned int currentBufferAndEvent = m_counter % m_registeredEvents.size(); //Counter is inc'ed in every Frame
temp.bufferFmt = m_currentlyMappedInputBuffer.mappedBufferFmt;
temp.inputBuffer = m_currentlyMappedInputBuffer.mappedResource; //got set by MapInputResource
temp.completionEvent = m_registeredEvents[currentBufferAndEvent];
temp.outputBitstream = m_registeredOutputBuffers[currentBufferAndEvent];
temp.inputWidth = m_width;
temp.inputHeight = m_height;
temp.inputPitch = m_width;
temp.inputTimeStamp = m_counter;
temp.pictureStruct = NV_ENC_PIC_STRUCT_FRAME; // According to samples
temp.qpDeltaMap = NULL;
temp.qpDeltaMapSize = 0;
EventWithId latestEvent(currentBufferAndEvent,
m_registeredEvents[currentBufferAndEvent]);
PushBackEncodeEvent(latestEvent); // Store the Event with its ID in a Queue
CheckApiError(m_apiFunctions.nvEncEncodePicture(m_Encoder, &temp));
m_counter++;
UnmapInputResource(id); // Unmap
Every little hint about where to look is very much appreciated; I'm running out of ideas about what might be wrong.
Thanks a lot!
With the help of hall822 from the NVIDIA forums I managed to solve the issue.
The primary error was the pitch I used when registering my CUDA resource. I'm using a framebuffer renderbuffer to draw my content into; its data is a plain, unpitched array. My first thought, passing a pitch equal to zero, failed: the encoder did nothing. The next idea was to set it to the width of the frame; then only a quarter of the image was encoded.
// NOTE: I've set the pitch to the width of the frame, because the resource is a non-pitched
//cudaArray. Is this correct? Pitch = 0 would produce no output.
res.pitch = pitch;
To answer this question: yes, it is correct. But the pitch is measured in bytes, so because I'm encoding RGBA frames, the correct pitch has to be FRAME_WIDTH * 4.
The second error was that my color channels were not right (see point 4 in my opening post). The NVIDIA enum says that the encoder expects the channels in ARGB format, but what is actually meant is BGRA, so the alpha channel, which is always 255, polluted the blue channel.
Edit: This may be due to the fact that NVIDIA uses little endian internally. I'm writing my pixel data to a byte array; choosing another type like int32 may allow one to pass actual ARGB data.
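To make the two fixes concrete, here is a minimal, self-contained sketch (plain C++, no NVENC SDK calls; the frame size and buffer are illustrative, not taken from the question). It computes the pitch in bytes for a tightly packed RGBA frame and swizzles RGBA into the BGRA byte order the encoder actually consumes:
#include <cstddef>
#include <cstdio>
#include <utility>
#include <vector>

int main()
{
    const int width  = 1280;
    const int height = 720;

    // Pitch is measured in bytes: 4 bytes per pixel for a tightly packed RGBA frame.
    const unsigned int pitchInBytes = static_cast<unsigned int>(width) * 4;
    std::printf("pitch = %u bytes\n", pitchInBytes);

    // A dummy RGBA frame (R, G, B, A per pixel).
    std::vector<unsigned char> frame(static_cast<std::size_t>(width) * height * 4, 0);

    // Swizzle RGBA -> BGRA in place; the "ARGB" enum value effectively means
    // BGRA byte order in memory on a little-endian machine.
    for (std::size_t px = 0; px < frame.size(); px += 4)
    {
        std::swap(frame[px + 0], frame[px + 2]); // swap R and B, keep G and A
    }
    return 0;
}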

Can't display a PNG using Glut or OpenGL

Code is here:
void readOIIOImage( const char* fname, float* img )
{
    int xres, yres;
    ImageInput *in = ImageInput::create (fname);
    if (! in) { return; }
    ImageSpec spec;
    in->open (fname, spec);
    xres = spec.width;
    yres = spec.height;
    iwidth = spec.width;
    iheight = spec.height;
    channels = spec.nchannels;
    cout << "\n";
    pixels = new float[xres*yres*channels];
    in->read_image (TypeDesc::FLOAT, pixels);
    long index = 0;
    for( int j = 0; j < yres; j++ )
    {
        for( int i = 0; i < xres; i++ )
        {
            for( int c = 0; c < channels; c++ )
            {
                img[ (i + xres*(yres - j - 1))*channels + c ] = pixels[index++];
            }
        }
    }
    in->close ();
    delete in;
}
Currently, my code handles JPG files fine: it can read the file's information and display it correctly. However, when I try reading in a PNG file, it doesn't display correctly at all. Usually it displays the same distorted version of the image in three separate columns. It's very strange. Any idea why this is happening with the given code?
Additionally, the JPG files all have 3 channels. The PNG has 2.
fname is simply a filename, and img is `new float[3*size]`.
Any help would be great. Thanks.
Usually, it kind of displays the same distorted version of the image in three separate columns on the display. It's very strange. Any idea why this is happening with the given code?
This reads a lot like the output you get from the decoder is in row-planar format. Planar means that you get individual rows, one for every channel, one after another. The distortion and the discrepancy between the number of channels in the PNG and the apparent channel count are likely due to an alignment mismatch. You didn't specify which image decoder library you're using exactly, so I can't look up how it communicates the layout of the pixel buffer. I suppose you can read the necessary information from ImageSpec.
Anyway, you'll have to adjust the indexing in your pixel-buffer rearrangement loop a bit so that consecutive row planes are interleaved into channel tuples.
Of course, you could also use a ready-made image-file-to-OpenGL reader library. DevIL is thrown around a lot, but it's not very well maintained. SOIL seems to be a popular choice these days.
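As a rough illustration of that rearrangement, here is a minimal sketch under the assumption that the decoder hands back row-planar data, i.e. for each scanline one full row per channel back to back (verify the real layout via ImageSpec before relying on this):
#include <cstddef>
#include <vector>

// Hypothetical layout: the source buffer stores, for each scanline, one complete
// row per channel back to back. The destination stores interleaved channel
// tuples (e.g. RGBRGB...), which is what OpenGL expects by default.
std::vector<float> rowPlanarToInterleaved(const std::vector<float>& src,
                                          int width, int height, int channels)
{
    std::vector<float> dst(static_cast<std::size_t>(width) * height * channels);
    for (int y = 0; y < height; ++y)
    {
        for (int c = 0; c < channels; ++c)
        {
            // Start of channel c's row inside scanline y of the planar source.
            const std::size_t srcRow = (static_cast<std::size_t>(y) * channels + c) * width;
            for (int x = 0; x < width; ++x)
            {
                dst[(static_cast<std::size_t>(y) * width + x) * channels + c] = src[srcRow + x];
            }
        }
    }
    return dst;
}
The interleaved buffer can then be handed to OpenGL with a pixel format matching the channel count (for example GL_RGB for 3 channels).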

Firemonkey: Shrink text font to fit in TLabel

I am attempting to lower the font size of a TLabel if its text is too large to fit in the confines of the label. I didn't see any properties I could set on the label to achieve this, so I have tried writing my own method. My method works by using TCanvas.TextWidth to measure the width of the text in the label and shrinking the font until the width of the text fits within the width of the label.
void __fastcall ShrinkFontToFitLabel( TCanvas * Canvas, TLabel * Label )
{
    float NewFontSize = Label->Font->Size;
    Canvas->Font->Family = Label->Font->Family;
    Canvas->Font->Size = NewFontSize;
    while( Canvas->TextWidth( Label->Text ) > Label->Width && NewFontSize > MinimumFontSize )
    {
        NewFontSize -= FontSizeDecrement;
        Canvas->Font->Size = NewFontSize;
    }
    Label->Font->Size = NewFontSize;
}
This works some of the time; however, other times it does not shrink the font nearly enough. It seems as if the value I get from calling Canvas->TextWidth is often much smaller than the width in pixels that the label actually needs in order to fit the text.
Am I using Canvas->TextWidth incorrectly? Is there a better way to calculate the width of a string, or to resize the font of a TLabel so its text fits within its dimensions?
Edit:
In this case, I am passing into my function the TCanvas that my label is sitting on. I have tried using that TCanvas as well as Label->Canvas. Both give me the same number for the text width, and both are short of the actual value in pixels needed to display the whole string.
The following code is taken from code that works in an FMX application, modified slightly to remove arrays that were being iterated through and to declare a variable locally to the function. It is run in a TForm method; Canvas here is the form's Canvas. You can see that I'm using "- 35" at one point - this might be because the numbers weren't quite right.
double InitialFontSize = 30;
Canvas->Font->Size = InitialFontSize;
StoryHeadlineLabel->Font->Size = InitialFontSize;
bool fits = false;
do
{
    double widthA = Canvas->TextWidth (StoryHeadlineLabel->Text);
    if (widthA > StoryHeadlineLabel->Width - 35)
    {
        StoryHeadlineLabel->Font->Size --;
        Canvas->Font->Size --;
    }
    else
        fits = true;
    if (StoryHeadlineLabel->Font->Size < 6)
        fits = true;
} while (!fits);
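If the linear decrement turns out to be too coarse or too slow, the same idea can be expressed as a binary search over the font size. The sketch below is generic C++ rather than FMX-specific: measureWidth is a placeholder for whatever text measurement you trust (for example, setting the candidate size on a canvas font and then calling its TextWidth), and the iteration count is an arbitrary choice.
#include <functional>

// Find the largest font size within [minSize, maxSize] whose measured text
// width does not exceed availableWidth. measureWidth stands in for the
// toolkit's text measurement with the candidate size applied.
float fitFontSize(const std::function<float(float fontSize)>& measureWidth,
                  float availableWidth, float minSize, float maxSize)
{
    float lo = minSize, hi = maxSize;
    for (int i = 0; i < 20; ++i) // 20 halvings narrow the range far below a point
    {
        const float mid = (lo + hi) / 2.0f;
        if (measureWidth(mid) <= availableWidth)
            lo = mid; // fits: try a larger size
        else
            hi = mid; // too wide: shrink
    }
    return lo;
}
Whichever loop you use, the measurement is only trustworthy if the measuring canvas uses the same font family, style, and scale as the label itself, which is likely why the fixed "- 35" fudge factor was needed above.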

Query maximum webcam resolution in OpenCV

I'm dealing with several types of cameras and I need to know the maximum resolution each one is capable of.
Is there a way to query such a property in OpenCV?
If not, is there any other way? The application will run under Windows (for the moment) and the whole project is being developed in C++.
A trick that works for me:
just set a very high resolution (above the capabilities of any usual capture device), then read back the current resolution.
You will see that the device automatically switches to its maximum value.
Code example in Python with OpenCV 3.0:
HIGH_VALUE = 10000
WIDTH = HIGH_VALUE
HEIGHT = HIGH_VALUE
self.__capture = cv2.VideoCapture(0)
fourcc = cv2.VideoWriter_fourcc(*'XVID')
self.__capture.set(cv2.CAP_PROP_FRAME_WIDTH, WIDTH)
self.__capture.set(cv2.CAP_PROP_FRAME_HEIGHT, HEIGHT)
width = int(self.__capture.get(cv2.CAP_PROP_FRAME_WIDTH))
height = int(self.__capture.get(cv2.CAP_PROP_FRAME_HEIGHT))
Hope it helps.
FINAL SOLUTION
As the accepted answer by user2949634 was written in Python, I'm posting the equivalent implementation in C++ for completeness:
void query_maximum_resolution(cv::VideoCapture* camera, int& max_width, int& max_height)
{
    // Save current resolution
    const int current_width  = static_cast<int>(camera->get(CV_CAP_PROP_FRAME_WIDTH));
    const int current_height = static_cast<int>(camera->get(CV_CAP_PROP_FRAME_HEIGHT));
    // Get maximum resolution
    camera->set(CV_CAP_PROP_FRAME_WIDTH, 10000);
    camera->set(CV_CAP_PROP_FRAME_HEIGHT, 10000);
    max_width  = static_cast<int>(camera->get(CV_CAP_PROP_FRAME_WIDTH));
    max_height = static_cast<int>(camera->get(CV_CAP_PROP_FRAME_HEIGHT));
    // Restore resolution
    camera->set(CV_CAP_PROP_FRAME_WIDTH, current_width);
    camera->set(CV_CAP_PROP_FRAME_HEIGHT, current_height);
}
VideoCapture::get(int propId)
Passing in CV_CAP_PROP_FRAME_WIDTH and CV_CAP_PROP_FRAME_HEIGHT will get you the resolution.
For getting the maximum possible resolution, all the functionality for cv::VideoCapture is in that link. There does not seem to be a way to do that directly, probably because many cameras expect you to know the possible resolutions from the manual and to set some flags to toggle what you want. One thing you can try is to keep a list of all common resolutions, then try all of them for each camera with VideoCapture::set while checking the return value to see if it was successful. There aren't many resolutions to search, so this should be viable; a rough sketch of that idea follows.
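The sketch below is one possible shape for that approach (the resolution list is arbitrary, it assumes the OpenCV 3.x+ property names, and in practice comparing the values you read back is more reliable than trusting set()'s return value alone):
#include <opencv2/opencv.hpp>
#include <vector>

// Try a list of common resolutions and keep the ones the camera actually
// reports back after setting them.
std::vector<cv::Size> probeCommonResolutions(cv::VideoCapture& cam)
{
    const cv::Size candidates[] = {
        cv::Size(640, 480), cv::Size(800, 600), cv::Size(1280, 720),
        cv::Size(1920, 1080), cv::Size(3840, 2160)
    };
    std::vector<cv::Size> supported;
    for (const cv::Size& s : candidates)
    {
        bool ok = cam.set(cv::CAP_PROP_FRAME_WIDTH, s.width);
        ok = cam.set(cv::CAP_PROP_FRAME_HEIGHT, s.height) && ok;
        const int w = static_cast<int>(cam.get(cv::CAP_PROP_FRAME_WIDTH));
        const int h = static_cast<int>(cam.get(cv::CAP_PROP_FRAME_HEIGHT));
        if (ok && w == s.width && h == s.height)
            supported.push_back(s);
    }
    return supported;
}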
I was searching on the same topic myself.
As previously answered, there is no direct property, but you can discover supported resolutions by trying to figure out which resolutions the camera accepts:
trying all common resolutions
probing from a minimum resolution and increasing the width/height
Here is a C++ code sample tested with OpenCV 2.4.8 on Windows.
Trying common resolutions:
const CvSize CommonResolutions[] = {
    cvSize(120, 90),
    cvSize(352, 240),
    cvSize(352, 288),
    // and so on
    cvSize(8192, 4608)
};

vector<CvSize> getSupportedResolutions(VideoCapture camera)
{
    vector<CvSize> supportedVideoResolutions;
    int nbTests = sizeof(CommonResolutions) / sizeof(CommonResolutions[0]);
    for (int i = 0; i < nbTests; i++) {
        CvSize test = CommonResolutions[i];
        // try to set resolution
        camera.set(CV_CAP_PROP_FRAME_WIDTH, test.width);
        camera.set(CV_CAP_PROP_FRAME_HEIGHT, test.height);
        double width = camera.get(CV_CAP_PROP_FRAME_WIDTH);
        double height = camera.get(CV_CAP_PROP_FRAME_HEIGHT);
        if (test.width == width && test.height == height) {
            supportedVideoResolutions.push_back(test);
        }
    }
    return supportedVideoResolutions;
}
Probing solution based on width increments:
vector<CvSize> getSupportedResolutionsProbing(VideoCapture camera)
{
    vector<CvSize> supportedVideoResolutions;
    int step = 100;
    double minimumWidth = 16;           // Microvision
    double maximumWidth = 1920 + step;  // 1080
    CvSize currentSize = cvSize(minimumWidth, 1);
    CvSize previousSize = currentSize;
    while (1) {
        camera.set(CV_CAP_PROP_FRAME_WIDTH, currentSize.width);
        camera.set(CV_CAP_PROP_FRAME_HEIGHT, currentSize.height);
        CvSize cameraResolution = cvSize(
            camera.get(CV_CAP_PROP_FRAME_WIDTH),
            camera.get(CV_CAP_PROP_FRAME_HEIGHT));
        if (cameraResolution.width == previousSize.width
            && cameraResolution.height == previousSize.height)
        {
            supportedVideoResolutions.push_back(cameraResolution);
            currentSize = previousSize = cameraResolution;
        }
        currentSize.width += step;
        if (currentSize.width > maximumWidth)
        {
            break;
        }
    }
    return supportedVideoResolutions;
}
I hope this will be helpful for future users.
For Ubuntu: install
sudo apt install v4l-utils
And then, run:
v4l2-ctl -d /dev/video0 --list-formats-ext
These kinds of hardware capabilities can be queried from USB devices if the camera is UVC compliant; it depends on the driver/firmware of the device. See, for example, these Microsoft requirements to guess what kind of support you can expect on Windows platforms.

Using time in OpenCV for frame processes and other tasks

I want to count the vehicles in a video. After frame differencing I get a grayscale, more or less binary image. I have defined a region of interest to work on a specific area of the frames; the pixel values of the vehicles passing through the region of interest are higher than 0, or even higher than 40 or 50, because they are white.
My idea is that when a certain number of pixels are white within a specific interval of time (say 1-2 seconds), a vehicle must be passing, so I increment the counter.
What I want is to check whether white pixels are still coming after 1-2 seconds. If no white pixels are coming, it means the vehicle has passed and the next vehicle is about to come, and at that point the counter must be incremented.
One method that came to my mind is to count the frames of the video and store the count in a variable called No_of_frames. Using that variable, I think I can estimate the time passed: if the value of No_of_frames is greater than, let's say, 20, nearly 1 second has passed, given that my video's frame rate is 25-30 fps.
I am using Qt Creator on Windows 7 with OpenCV 2.3.1.
My code is something like:
for (int i = 0; i < matFrame.rows; i++)
{
    for (int j = 0; j < matFrame.cols; j++)
        if (matFrame.at<uchar>(i, j) > 100) // values of pixels greater than 100
                                            // will be considered as white
        {
            whitePixels++;
        }
    if () // here I want to use time. The 'if' statement must be like:
          // if (total_no._of_whitepixels > 100 && no_white_pixel_came_after 2secs)
          // which means that a vehicle has just passed so increment the counter.
    {
        counter++;
    }
}
Any other idea for counting the vehicles, better than mine, will be most welcome. Thanks in advance.
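To make the frame-count timing idea above concrete, here is a rough sketch (the names fps, whitePixelThreshold and gapFrames are mine, not from the code above): a gap of roughly two seconds' worth of frames with no white pixels is treated as the end of one vehicle.
#include <cstdio>

const double fps = 25.0;                              // assumed video frame rate
const int    whitePixelThreshold = 100;               // "vehicle present" if above this
const int    gapFrames = static_cast<int>(2 * fps);   // about 2 seconds of frames

int  vehicleCounter = 0;
int  framesSinceLastWhite = 0;
bool vehicleInProgress = false;

void processFrame(int whitePixelsInRoi)
{
    if (whitePixelsInRoi > whitePixelThreshold)
    {
        vehicleInProgress = true;        // something bright is crossing the ROI
        framesSinceLastWhite = 0;
    }
    else if (vehicleInProgress && ++framesSinceLastWhite >= gapFrames)
    {
        vehicleCounter++;                // ~2 s with no white pixels: the vehicle has passed
        vehicleInProgress = false;
        framesSinceLastWhite = 0;
    }
}

int main()
{
    // Tiny synthetic test: 50 "busy" frames followed by 60 "empty" frames.
    for (int f = 0; f < 50; ++f) processFrame(500);
    for (int f = 0; f < 60; ++f) processFrame(0);
    std::printf("vehicles counted: %d\n", vehicleCounter);
    return 0;
}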
For background segmentation I am using the following algorithm, but it is very slow and I don't know why. The whole code is as follows:
// opencv2/video/background_segm.hpp OPENCV header file must be included.
IplImage* tmp_frame = NULL;
CvCapture* cap = NULL;
bool update_bg_model = true;
Mat element = getStructuringElement( 0, Size( 2,2 ), Point() );
Mat eroded_frame;
Mat before_erode;
if( argc > 2 )
    cap = cvCaptureFromCAM(0);
else
    // cap = cvCreateFileCapture( "C:\\4.avi" );
    cap = cvCreateFileCapture( "C:\\traffic2.mp4" );
if( !cap )
{
    printf("can not open camera or video file\n");
    return -1;
}
tmp_frame = cvQueryFrame(cap);
if( !tmp_frame )
{
    printf("can not read data from the video source\n");
    return -1;
}
cvNamedWindow("BackGround", 1);
cvNamedWindow("ForeGround", 1);
CvBGStatModel* bg_model = 0;
for( int fr = 1; tmp_frame; tmp_frame = cvQueryFrame(cap), fr++ )
{
    if( !bg_model )
    {
        // create BG model
        bg_model = cvCreateGaussianBGModel( tmp_frame );
        // bg_model = cvCreateFGDStatModel( temp );
        continue;
    }
    double t = (double)cvGetTickCount();
    cvUpdateBGStatModel( tmp_frame, bg_model, update_bg_model ? -1 : 0 );
    t = (double)cvGetTickCount() - t;
    printf( "%d. %.1f\n", fr, t/(cvGetTickFrequency()*1000.) );
    before_erode = bg_model->foreground;
    cv::erode( (Mat)bg_model->background, (Mat)bg_model->foreground, element );
    // eroded_frame = bg_model->foreground;
    // frame = (IplImage *)erode_frame.data;
    cvShowImage("BackGround", bg_model->background);
    cvShowImage("ForeGround", bg_model->foreground);
    // cvShowImage("ForeGround", bg_model->foreground);
    char k = cvWaitKey(5);
    if( k == 27 ) break;
    if( k == ' ' )
    {
        update_bg_model = !update_bg_model;
        if( update_bg_model )
            printf("Background update is on\n");
        else
            printf("Background update is off\n");
    }
}
cvReleaseBGStatModel( &bg_model );
cvReleaseCapture(&cap);
return 0;
A great deal of research has been done on vehicle tracking and counting. The approach you describe appears to be quite fragile, and is unlikely to be robust or accurate. The main issue is using a count of pixels above a certain threshold, without regard for their spatial connectivity or temporal relation.
Frame differencing can be useful for separating a moving object from its background, provided the object of interest is the only (or largest) moving object.
What you really need is to first identify the object of interest, segment it from the background, and track it over time using an adaptive filter (such as a Kalman filter). Have a look at the OpenCV video reference. OpenCV provides background subtraction and object segmentation to do all the required steps.
I suggest you read up on OpenCV - Learning OpenCV is a great read. And also on more general computer vision algorithms and theory - http://homepages.inf.ed.ac.uk/rbf/CVonline/books.htm has a good list.
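As a starting point for that pipeline, here is a minimal sketch using OpenCV's built-in MOG2 background subtractor through the C++ API. It assumes OpenCV 3.x or later rather than the 2.3.1 C API used in the question, and the file name is illustrative:
#include <opencv2/opencv.hpp>

int main()
{
    cv::VideoCapture cap("traffic2.mp4");   // illustrative file name
    if (!cap.isOpened())
        return -1;

    // MOG2 models the background per pixel and flags moving pixels as foreground.
    cv::Ptr<cv::BackgroundSubtractorMOG2> subtractor = cv::createBackgroundSubtractorMOG2();

    cv::Mat frame, foregroundMask;
    while (cap.read(frame))
    {
        subtractor->apply(frame, foregroundMask);

        // Clean up noise before any counting or tracking logic.
        cv::erode(foregroundMask, foregroundMask, cv::Mat(), cv::Point(-1, -1), 1);
        cv::dilate(foregroundMask, foregroundMask, cv::Mat(), cv::Point(-1, -1), 2);

        cv::imshow("ForeGround", foregroundMask);
        if (cv::waitKey(5) == 27)           // Esc quits
            break;
    }
    return 0;
}
From the cleaned-up foreground mask you can extract connected components (cv::findContours), keep blobs of plausible vehicle size, and feed their centroids to a Kalman filter for tracking and counting.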
Normally they just put a small pneumatic tube across the road (a soft pipe semi-filled with air) attached to a simple counter. Each vehicle passing over the tube generates two pulses (first the front, then the rear wheels). The counter records the number of pulses in specified time intervals and divides by 2 to get the approximate vehicle count.