Does the display buffer of an operating system still exist when there is no GPU or monitor attached to a PC? - c++

By "display buffer" I mean the buffer in which the screenshot of the monitor, i.e. the entire monitor screen, is stored.
I am trying to design software which continuously captures the screen images and assembles them into a video.
What I want to know is:
1) Where exactly are the pixels of the image that appears on my monitor stored in my computer (GPU memory, RAM)?
2) Is this image still present when there is no GPU and no monitor connected to the PC?
EDIT: I am targeting the Windows OS.

If there is a "display driver" of some sort, then there will be some form of framebuffer (what gets displayed). Exactly what form that takes, and how you access it, is OS dependent. In most systems you need at least some form of graphics processor (or display processor - basically just the part that scans the framebuffer and pushes the pixels out one at a time, in sync with the frequency the monitor, or the LCD panel on for example a phone, expects them) to actually feed the data to a monitor, and if that doesn't exist, a "display driver" would be pretty pointless. However, there are systems that use network displays (e.g. "Remote Desktop" for Windows), where the display buffer is entirely software driven: the graphics displayed are calculated using the CPU only, and there is no requirement to have GPU or display-processor hardware.
Of course, the frame buffer is not necessarily stored "as it appears on screen" - for example, one could store all the green pixels in one "plane", all the red pixels in another "plane", the blue in a third plane, and any alpha in a fourth plane. Or in groups of 8 x 16 pixels of each colour. Or in a compressed form (e.g. for software usage over a "Remote Desktop"-type connection, one could imagine a run-length-encoded frame buffer).
The most obvious case of a "hardware-less" display would probably be a Virtual Machine. Of course, that's not truly hardware-less; it's just that the graphics processor itself (or "display adapter", which is really a display processor) is entirely software-based, but pretends to be hardware. However, since that display processor is very simple, all the actual frame buffer operations are done in software inside the VM itself; all the display adapter does is keep track of where the frame buffer is and send it out to the "viewer" (typically over a "Remote Desktop", "VNC" or similar network protocol for viewing the display of a "remote" computer).
To answer your specific questions:
Yes, one of those, most likely, but it could be "something else" too. Exactly how the frame buffer is stored is entirely up to the design of the whole system. Without knowing the exact hardware (e.g. MacBook Pro, PC with graphics card X, embedded system, smartphone type Y, etc.) and software combination (Windows, Linux, a custom OS), there is nothing we can say about how it is stored, as there are as many solutions as you can possibly imagine (if you have a VERY good imagination; otherwise there are a few more).
If there is no monitor connected, certainly - I'm not aware of any system where the frame buffer is disabled due to "no monitor". The electrical signals on the VGA/DVI/HDMI connector are typically turned off to reduce interference when no monitor is connected, but the image is still being generated, and it appears instantly if you plug in a monitor. As explained at the top, a frame buffer doesn't require a GPU - it's just easier/faster with one.
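Since the goal is capturing the screen into a video on Windows, here is a minimal sketch of one common approach using the GDI calls GetDC, BitBlt and GetDIBits to copy the current desktop contents into a plain RAM buffer (the more modern route would be the DXGI Desktop Duplication API; error handling is omitted and the function name is just illustrative):

#include <windows.h>
#include <cstddef>
#include <cstdint>
#include <vector>

// Grab the primary monitor's current contents into a 32-bit BGRA buffer in RAM.
std::vector<std::uint32_t> capture_primary_screen(int& width, int& height)
{
    width  = GetSystemMetrics(SM_CXSCREEN);
    height = GetSystemMetrics(SM_CYSCREEN);

    HDC screenDC = GetDC(nullptr);                        // DC covering the whole screen
    HDC memDC    = CreateCompatibleDC(screenDC);          // in-memory DC
    HBITMAP bmp  = CreateCompatibleBitmap(screenDC, width, height);
    HGDIOBJ old  = SelectObject(memDC, bmp);

    // Copy the visible desktop into our bitmap.
    BitBlt(memDC, 0, 0, width, height, screenDC, 0, 0, SRCCOPY);
    SelectObject(memDC, old);                             // deselect before GetDIBits

    // Describe the layout we want and pull the pixels into system RAM.
    BITMAPINFO bmi = {};
    bmi.bmiHeader.biSize        = sizeof(BITMAPINFOHEADER);
    bmi.bmiHeader.biWidth       = width;
    bmi.bmiHeader.biHeight      = -height;                // negative height = top-down rows
    bmi.bmiHeader.biPlanes      = 1;
    bmi.bmiHeader.biBitCount    = 32;
    bmi.bmiHeader.biCompression = BI_RGB;

    std::vector<std::uint32_t> pixels(static_cast<std::size_t>(width) * height);
    GetDIBits(memDC, bmp, 0, height, pixels.data(), &bmi, DIB_RGB_COLORS);

    DeleteObject(bmp);
    DeleteDC(memDC);
    ReleaseDC(nullptr, screenDC);
    return pixels;                                        // hand this frame to the video encoder
}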

Related

RPI OpenGL PWM display driver

So I'm building a system based on a Raspberry Pi 4 running Linux (image created through Buildroot) driving a 64x32 RGB LED matrix, and I'm very confused about the Linux software stack. I'd like to be able to use OpenGL capabilities on a small-resolution screen that would then be transferred to a driver that would actually drive the LED matrix.
I've read about DRM, KMS, GEM and other systems and I've concluded the best way to go about it would be to have the following working scheme:
User space:    App
                |  OpenGL
                v
Kernel space:  DRM --GEM--> LED device driver
                                 |
                                 v
Hardware:                    LED matrix
Some of this may not make a lot of sense since the concepts are still confusing to me.
Essentially, the app would make OpenGL calls that would generate frames, which could be mapped to buffers in DRM, which could be shared with the LED device driver, which would then drive the LEDs in the matrix.
Would something like this be the best way to go about it?
I could just program some dumb buffer cpu implementation but I'd rather take this as a learning experience.
OpenGL renders into a buffer (called a "framebuffer") that is usually displayed on the screen. Rendering into an off-screen buffer (as the name implies) does not render onto the screen but into an array, which can be read back by C/C++. There is one indirection on modern operating systems: usually you have multiple windows visible on your screen, so the application can't render onto the screen itself but into a buffer managed by the windowing system, which is then composited into one final image. Linux uses Wayland; multiple Wayland clients can create and draw into the Wayland compositor's buffers.
If you only want to display your own application, just use an off-screen buffer.
If you want to display another application, read its framebuffer by writing your own Wayland compositor. Note this may be hard (I've never done that) if you want to use hardware acceleration.
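For the "render off screen and read the pixels back" route, a minimal sketch using an OpenGL framebuffer object and glReadPixels; it assumes a current OpenGL context already exists (e.g. created headless through EGL/GBM on the Pi) and that the usual GL headers/loader are available; W and H are placeholders for the LED matrix resolution:

#include <cstddef>
#include <cstdint>
#include <vector>

// Render one frame off screen and return the RGBA pixels, ready for the LED driver.
std::vector<std::uint8_t> render_frame_offscreen(int W, int H)
{
    GLuint fbo = 0, tex = 0;

    // Create a texture to render into and attach it to a framebuffer object.
    glGenTextures(1, &tex);
    glBindTexture(GL_TEXTURE_2D, tex);
    glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA8, W, H, 0, GL_RGBA, GL_UNSIGNED_BYTE, nullptr);

    glGenFramebuffers(1, &fbo);
    glBindFramebuffer(GL_FRAMEBUFFER, fbo);
    glFramebufferTexture2D(GL_FRAMEBUFFER, GL_COLOR_ATTACHMENT0, GL_TEXTURE_2D, tex, 0);
    glViewport(0, 0, W, H);

    // ... issue normal OpenGL draw calls here; they land in 'tex', not on any screen ...

    // Read the finished frame back into CPU memory.
    std::vector<std::uint8_t> pixels(static_cast<std::size_t>(W) * H * 4);
    glReadPixels(0, 0, W, H, GL_RGBA, GL_UNSIGNED_BYTE, pixels.data());

    glDeleteFramebuffers(1, &fbo);
    glDeleteTextures(1, &tex);
    return pixels;
}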

Video Output in Protected/Long Mode

I am Lukas and I have a question about plotting pixels on the screen in protected/long mode, and about video output in an OS in general. My question is: how can I display something on the screen in a high resolution, such as 1920*1080, or better for me 1680*1050 (because of my slightly old monitor), and how can I make a specific driver for my video card - Intel Integrated HD 620 on my main computer? On my dev server I have an integrated VGA controller on the motherboard - Intel 4 Series Chipset Integrated Graphics Controller (rev 3) - and I think that to control this specific card I just need the standard VGA controller stuff - its ports, DAC and so on - but I don't know how to make a driver for my external GPU (I mean one not integrated on the motherboard), which is a Fujitsu 4 Series Chipset Integrated Graphics Controller, and where I can get information about it, and also where I can get information about this whole subject and maybe some tutorial. Thanks very much for your help!
PS: Sorry for my English, I am not a native speaker.
The first problem is setting a usable video mode. For this there are 3 cases:
use the BIOS/VBE functions. This is easy enough in early boot code, but horribly messy after boot.
use UEFI functions (GOP, UGA). This is easy in early boot code, but impossible after boot.
write native video drivers for every possible video card. This is impossible - you can't write drivers for video cards that don't exist yet, so every time a new video card is released there will be no driver for however long it takes to write one (and as a sole developer you will never have time to write them all).
The "most sane" choice is "boot loader sets up default video mode using whatever firmware provides and gives OS details for frame buffer (then native video driver may be used to change video modes after early boot if there ever is a suitable native video driver)".
Note that for all of these cases (BIOS, UEFI and native driver) there's a selection process involved - you want to get information from the monitor describing what it supports, information from the video card about what it supports, and information from the OS about what it supports; and then use all of that information to find the best video mode that is supported by everything. You don't want to set up a 1920*1600 video mode just because the video card supports it (and then have your old monitor showing a black screen because it doesn't support that specific video mode).
For putting a pixel, the formula is mostly "address = video_frame_buffer_address + y * bytes_per_line + x * bytes_per_pixel", where video_frame_buffer_address is the virtual address of wherever you felt like mapping the frame buffer; the physical address of the frame buffer and the values for bytes_per_line and bytes_per_pixel are details that will come from the BIOS, UEFI or a native video driver.
For displaying anything on the screen, putting pixels like this is a huge performance disaster (you don't want the overhead of "address = video_frame_buffer_address + y * bytes_per_line + x * bytes_per_pixel" calculation for every pixel). Instead, you want higher level functions (e.g. to draw characters, draw lines, fill rectangles, ...) so that you can calculate a starting address once, then adjust that address as you draw instead of doing the full calculation again. For example; for drawing a rectangle you might end up with something vaguely like "for each horizontal line in rectangle { memset(address, colour, width); address += bytes_per_line; }".
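As an illustration of that, here is a minimal sketch of a 32-bpp rectangle fill that does the address calculation once per scanline; fb_base, pitch (bytes per line) and the coordinates are placeholders, and no clipping is done:

#include <cstddef>
#include <cstdint>

// Fill a w*h rectangle at (x, y) with a 32-bit colour; assumes 4 bytes per pixel.
void fill_rect(std::uint8_t* fb_base, std::size_t pitch,
               int x, int y, int w, int h, std::uint32_t colour)
{
    std::uint8_t* row = fb_base + std::size_t(y) * pitch + std::size_t(x) * 4;
    for (int j = 0; j < h; ++j) {
        std::uint32_t* p = reinterpret_cast<std::uint32_t*>(row);
        for (int i = 0; i < w; ++i)
            p[i] = colour;                        // no per-pixel address formula needed
        row += pitch;                             // just step to the next scanline
    }
}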
However, you should also know that (to increase the chance that your code will work on more different computers) you will need to support multiple different colour depths and pixel formats; and if you have 10 different drawing functions (to draw characters, lines, rectangles, ...) and support 10 different colour depths/pixel formats, it adds up to 100 different functions. An easier alternative is to have a generic pixel format (e.g. "32 bits per pixel ARGB"), do all the drawing into a buffer in RAM using that generic pixel format, and then have functions that blit data from the buffer in RAM to the frame buffer while converting it to whatever the video mode actually wants.
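A sketch of that second approach: everything is drawn into a 0xAARRGGBB shadow buffer in RAM, and a per-format blit converts it on the way to the real frame buffer - here RGB565 is used as the example target format:

#include <cstddef>
#include <cstdint>

// Convert a 32-bit ARGB shadow buffer to a 16-bit RGB565 framebuffer, line by line.
void blit_argb8888_to_rgb565(const std::uint32_t* src, int w, int h,
                             std::uint8_t* dst, std::size_t dst_pitch)
{
    for (int y = 0; y < h; ++y) {
        std::uint16_t* out = reinterpret_cast<std::uint16_t*>(dst + std::size_t(y) * dst_pitch);
        for (int x = 0; x < w; ++x) {
            std::uint32_t c = src[std::size_t(y) * w + x];
            out[x] = static_cast<std::uint16_t>(((c >> 19) & 0x1F) << 11   // top 5 bits of red
                                              | ((c >> 10) & 0x3F) << 5    // top 6 bits of green
                                              | ((c >> 3) & 0x1F));        // top 5 bits of blue
        }
    }
}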

OSX pushing pixels to screen with minimum latency

I'm trying to develop some very low-latency graphics applications and am getting really frustrated by how long it takes to draw to screen through OpenGL. Every discussion I find about it online addresses optimizing the OpenGL pipeline, but doesn't get anywhere near the results that I need.
Check this out:
https://www.dropbox.com/s/dbz4bq67cxluhs7/MouseLatency.MOV?dl=0
You probably noticed this before: with a C++ OpenGL app, dragging the mouse around the screen and drawing the mouse location in OpenGL, the OpenGL drawing lags behind by 3 or 4 frames. Clearly OS X CAN draw [the cursor] to the screen with very low latency, but OpenGL is much slower. So let's say I don't need to do any fancy OpenGL rendering. I just want to push pixels to the screen somehow. Is there a way for me to bypass OpenGL completely and draw to the screen faster? Or is this kind of functionality going to be locked inside the kernel somewhere where I can't reach it?
datenwolf's answer is excellent. I just wanted to add one thing to this discussion regarding triple buffering at the compositor level, since I am very familiar with the Microsoft Windows desktop compositor.
I know you are asking about OS X here, but the implementation details I am going to discuss are the most sensible way of implementing this stuff and I would expect to see other systems work this way too.
Triple buffering as you might enable at the application level adds a third buffer to the swap-chain that is synchronized to refresh. That way of doing triple buffering does add latency, because that third buffer has to be displayed and nothing is allowed to touch it until this happens (this is D3D's mandated behavior -- the behavior and feature itself are undefined in OpenGL); but the way the Desktop Window Manager (Windows) works is slightly different.
The behavior I have seen most drivers implement for desktop composition is frame dropping. In any situation where multiple frames are finished between refreshes, all but one of those frames are discarded. You actually get lower latency using a window rather than fullscreen + triple buffering, because it does not block buffer swaps when the third buffer (owned by the compositor) has a finished frame waiting to be displayed.
It creates a whole different set of visual issues if framerate is not reasonably consistent. Technically, pixels belonging to dropped frames have infinite latency, so the benefits from latency reduction done this way might be worthless if you needed every single frame drawn to appear on screen.
I believe you can get this behavior on OS X (if you want it) by disabling VSYNC and drawing in a window. VSYNC basically only serves as a form of frame pacing (trading latency for consistency) in this scenario, and tearing is eliminated by the compositor itself regardless of what rate you draw at.
Regarding mouse cursor latency:
The cursor in any modern window system will always track with minimum latency. There is literally a feature on graphics hardware called a "hardware cursor," where the driver stores the cursor position and then once per-refresh, has the hardware overlay the cursor on top of whatever is sitting in the framebuffer waiting to be scanned-out. So even if your application is drawing at 30 FPS on a 60 Hz display, the cursor is updated every 16 ms when the hardware cursor's used.
This bypasses all graphics APIs altogether, but is quite limited (e.g. it uses the OS-defined cursor).
TL;DR: Latency comes in many forms.
If your problem is input latency, then you can mitigate that by reducing the number of pre-rendered frames and avoiding triple buffering. I could not begin to tell you how to reduce the number of driver pre-rendered frames on OS X.
Minimize length of time before something shows up on screen
If your problem is the amount of time that passes between executions of your render loop, you would go the other way. Increase pre-rendered frames, draw in a window and disable VSYNC. You may run into a lot of frames that are drawn but never displayed in this scenario.
Minimize time spent blocking (increase FPS); some frames will never be displayed
Pre-rendered frames are a powerful little feature that you do not get control over at the OpenGL API level. It sets up how deeply the driver is allowed to pipeline everything and depending on the desired task you will trade different types of latency by fiddling with it. Many gamers swear by setting this value to 1 to minimize input latency at the cost of overall framerate "smoothness."
UPDATE:
Pre-rendered frames are one reason for your multi-frame delay. Fixing this in a cross-platform way is difficult (it's a driver setting), but if you have access to Fence Sync Objects you can produce the same behavior as forcing this to 1.
I can explain this in more detail if need be; the general idea is that you insert a fence sync after the buffer swap and then wait for it to be signaled before the first command of the next frame is allowed to begin. Performance may take a nose dive, but latency will be minimized since the CPU won't be rendering ahead of the GPU anymore.
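A rough sketch of that idea (it assumes OpenGL 3.2+ or the ARB_sync extension; the frame-loop structure around it is up to you):

// At the end of each frame, right after the buffer swap:
GLsync frame_fence = glFenceSync(GL_SYNC_GPU_COMMANDS_COMPLETE, 0);

// At the start of the next frame, before issuing any new GL commands:
GLenum status = GL_TIMEOUT_EXPIRED;
while (status == GL_TIMEOUT_EXPIRED) {
    // Wait in 1 ms slices until the GPU has finished the previous frame.
    status = glClientWaitSync(frame_fence, GL_SYNC_FLUSH_COMMANDS_BIT, 1000000);
}
glDeleteSync(frame_fence);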
There are a number of latencies at play here.
Input event → drawing state latency
In your typical interactive application you have an event loop that usually goes:
collect user input
process user input
determine what's to be drawn
draw to the back buffer
swap back to front buffer
With the usual ways in which event–update–display loops are written, there's almost no delay between step 5 of the previous iteration and step 1 of the following one, which means that steps 2, 3, and 4 operate on data that lags about one frame period behind.
So this is the first source of latency.
Triple buffering / composition latency
Many graphics pipelines enable triple buffering for a smoother display update. Instead of keeping only a back and a front buffer around, there's also a third buffer in between. The average rate at which these buffers are drawn to is the display refresh period, and the buffers themselves are stepped at exactly the display refresh period. So this adds another frame period of latency.
If you're running on a system with a window compositor (which is the default on Mac OS X) this effectively adds another buffer stage, so if you've got a double-buffered mode it gives you triple buffering, and if you had triple buffering it gives you a "quad" buffer (quotes here because quad buffer is a term usually used for stereoscopic rendering).
What can you do about this:
Turn off composition
Windows (through the DWM API) and Mac OS X allow you to turn off composition or bypass the compositor.
Reducing input lag
Try to collect and integrate the user input as late as possible (use high-resolution sleeps). If you've got only a very simple scene you can push the drawing quite close to the V-Sync deadline; in fact the NVidia OpenGL implementation has a vendor-specific extension that allows you to sleep until a specific amount of time before the next V-Sync.
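A generic sketch of that "sample input as late as possible" idea without any vendor extension; the next_vsync estimate and the 2 ms safety margin are assumptions you would have to tune for your own scene, and the commented-out calls are placeholders:

#include <chrono>
#include <thread>

void run_one_frame(std::chrono::steady_clock::time_point next_vsync)
{
    // Sleep until shortly before the estimated next V-Sync, leaving just enough
    // time for input sampling plus the (simple) draw calls.
    const auto margin = std::chrono::milliseconds(2);   // tune for your scene
    std::this_thread::sleep_until(next_vsync - margin);

    // Only now collect the freshest input state and draw with it.
    // poll_input();      // placeholder
    // draw_scene();      // placeholder
    // swap_buffers();    // placeholder; blocks until V-Sync when VSYNC is on
}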
If your scene is complex but separable into parts that require low-latency user input and parts where it doesn't matter so much, you can draw the higher-latency stuff earlier and integrate the user input only at the very last moment. Of course, if the mouse is used to control the viewing direction, or even worse you're rendering for a VR head-mounted display, things are going to become difficult.

How to draw a pixel by changing video memory map directly in a C program (without library functions)

Is it possible to display a black dot by changing values in the screen (video, i.e. monitor) memory map in RAM using a C program?
I don't want to use any library functions, as my primary aim is to learn how to develop a simple OS.
I tried accessing the start of the screen memory map, i.e. 0xA0000 (in C).
I tried to run the program but got a segmentation fault since no direct access is provided. As superuser, the program runs but produces no visible change.
Currently I am testing in VirtualBox.
A "real" operating system will not use the framebuffer at address 0xA0000, so you can't draw on the screen by writing to it directly. Instead your OS probably has proper video drivers that will talk to the hardware in various very involved ways.
In short there's no easy way to do what you want to do on a modern OS.
On the other hand, if you want to learn how to write your own OS, then it would be very good practice to try to write a minimal kernel that can output to the VGA text framebuffer at 0xB8000, and maybe then to the VGA graphics framebuffer at 0xA0000.
You can start using those framebuffers and drawing on the screen almost immediately after the BIOS jumps to your kernel, with a minimal amount of setting up. You could do that directly from real mode in maybe a hundred lines of assembler tops, or perhaps in C with a couple lines of assembler glue first.
Even simpler would be to have GRUB set up the hardware, boot your minimal kernel, and you can directly write to it in a couple lines.
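For the VGA text framebuffer mentioned above, a minimal freestanding sketch; it assumes the machine is still in the standard 80x25 text mode and that 0xB8000 is accessible at that address, which is the usual state right after a BIOS or GRUB boot:

#include <cstdint>

// Each text cell is two bytes: the character and an attribute (colour) byte.
void put_char(int row, int col, char ch, std::uint8_t attr /* 0x0F = white on black */)
{
    volatile std::uint16_t* vga_text = reinterpret_cast<volatile std::uint16_t*>(0xB8000);
    vga_text[row * 80 + col] =
        static_cast<std::uint16_t>((attr << 8) | static_cast<unsigned char>(ch));
}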
The short answer is no, because the frame buffer on modern operating systems is set up as determined by the VBIOS and the kernel driver(s). It depends on the amount of VRAM present on the board, the size of the GART, the physical RAM present, and a whole bunch of other stuff (VRAM reservation, whether it should be visible to the CPU or not, etc.). On top of this, modern OSes use multiple back buffers and flip the hardware to display between these buffers, so even if you could directly poke the frame buffer, the address would change from frame to frame.
If you are interested in doing this for learning purposes, I would recommend creating a simple OGL or D3D (for example) 'function' which takes a 'fake' system-allocated frame buffer and presents it to the screen using regular hardware operations.
You could even set the refresh up on a timer to fake the update.
Then your fake OS would just write pixels to the fake system memory buffer, and this rendering function would take care of displaying it as if it were real.
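One way to sketch such a "present the fake framebuffer" function is with legacy OpenGL; glDrawPixels is deprecated but keeps the example short. A current GL context, the <GL/gl.h> header and a W*H RGBA buffer are assumed, and the swap call is platform specific:

#include <GL/gl.h>
#include <cstdint>
#include <vector>

// Push a CPU-side "fake framebuffer" to the screen through OpenGL.
void present_fake_framebuffer(const std::vector<std::uint8_t>& fake_fb, int W, int H)
{
    glClear(GL_COLOR_BUFFER_BIT);
    glRasterPos2i(-1, -1);   // bottom-left corner with the default matrices
    glDrawPixels(W, H, GL_RGBA, GL_UNSIGNED_BYTE, fake_fb.data());
    // SwapBuffers(hdc); / glXSwapBuffers(...);   // platform-specific present
}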

How are pixels drawn at the lowest level

I can use SetPixel (GDI) to set any pixel on the screen to a colour.
So how would I reproduce SetPixel at the lowest assembly level? What actually happens that triggers the instructions which say: OK, send a byte to position x in the framebuffer?
SetPixel most probably just calculates the address of the given pixel using a formula like:
pixel = frame_start + y * frame_width + x
(scaled by the pixel size in bytes) and then simply does *pixel = COLOR.
You can actually use CreateDIBSection to create your own buffers and associate them with a DeviceContext; then you can modify pixels at the low level using the formula above. This is useful if you have your own graphics library like AGG.
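A minimal sketch of that CreateDIBSection approach - a top-down 32-bit DIB, so that the pixel at (x, y) is simply bits[y * width + x]; error handling is omitted and the function is just illustrative:

#include <windows.h>
#include <cstdint>

// Create a 32-bit top-down DIB whose pixel memory we can write to directly,
// then blit it to a window DC so the result becomes visible.
void draw_into_dib(HDC windowDC, int width, int height)
{
    BITMAPINFO bmi = {};
    bmi.bmiHeader.biSize        = sizeof(BITMAPINFOHEADER);
    bmi.bmiHeader.biWidth       = width;
    bmi.bmiHeader.biHeight      = -height;          // negative height = top-down rows
    bmi.bmiHeader.biPlanes      = 1;
    bmi.bmiHeader.biBitCount    = 32;
    bmi.bmiHeader.biCompression = BI_RGB;

    void* bits = nullptr;
    HBITMAP dib = CreateDIBSection(nullptr, &bmi, DIB_RGB_COLORS, &bits, nullptr, 0);
    HDC memDC   = CreateCompatibleDC(nullptr);
    HGDIOBJ old = SelectObject(memDC, dib);

    // "SetPixel" by hand: write 0x00RRGGBB values straight into the buffer.
    std::uint32_t* px = static_cast<std::uint32_t*>(bits);
    px[100 * width + 50] = 0x00FF0000;              // a red pixel at (50, 100)

    BitBlt(windowDC, 0, 0, width, height, memDC, 0, 0, SRCCOPY);

    SelectObject(memDC, old);
    DeleteObject(dib);
    DeleteDC(memDC);
}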
When learning about GDI I like to look into the WINE source code; here you can see how complicated it actually is (dibdrv_SetPixel):
http://fossies.org/dox/wine-1.6.1/gdi32_2dibdrv_2graphics_8c_source.html
It must also take into account clipping regions, different pixel sizes, and probably other features. It is also possible that some drivers accelerate this in hardware, but I have not heard of it.
If you want to recreate SetPixel you need to know how your graphics hardware works. Most hardware manufacturers follow at least the VESA (VBE) standard. This standard specifies that you can set the display mode using interrupt 0x10.
Once the display mode is set, the memory region displayed is defined by the standard and you can simply write directly to display memory.
Advanced graphics hardware deviates from the standard (because it only covers the basics), so the above does not work for advanced features. You'll have to resort to the GPU documentation.
The "how" is always depends on "what", what I mean is that for different setups there are different methods, different systems different methods, what is common is that they are usually not allowing you to do it directly i.e. write to a memory address that will be displayed.
Some devices with dedicated setup may allow you to do that ( like some consoles do as far as I know ) but there you will have to do some locking or other utility work to make it work as it should.
Since in modern PCs Graphics Accelerators are fused into the video cards ( one counter example is the Voodoo 1 which needed a video card in order to operate, since it was just a 3D accelerator ) the GPU usually holds the memory that it will draw from the framebuffer in it's own memory making it inaccessible from the outside.
So generally you would say here is a memory address "download" the data into your own GPU memory and show it on screen, and this is where the desktop composition comes in. Since video cards suffer from this transfer all the time it is in fact faster to send the data required to draw and let the GPU do the drawing. So Aero is just a visual but as far as I know the desktop compositor works regardless of Aero making the drawing GPU dependent.
So technically low level functions such as SetPixel are software since Windows7 because of the things I mentioned above so solely because you just cant access the memory directly. So what I think it probably does is that for every HDC there is a bitmap and when you use the set pixel you simply set a pixel in that bitmap that is later sent to the GPU for display.
In case of DOS or other old tech. it is probably just an emulation in the same way it is done for GDI.
So in light of this:
So how would I reproduce SetPixel at the lowest assembly level?
It is probably just a copy to a memory location, but Windows integrates the window surfaces and its framebuffer in a way that means you will never get direct access. One way to emulate what it does is to create a bitmap, get its memory pointer, simply set the pixel manually, and then tell Windows to show this bitmap on screen.
What actually happens that triggers the instructions which say: OK, send a byte to position x in the framebuffer?
Like I said before, it really depends on the environment what is done at the moment you make this call; the code that needs to be executed comes from different places - some is written by Microsoft, some by the GPU's manufacturer - and all of these together produce that pixel you see on your screen.
To set a pixel in the framebuffer using a video mode with 32-bit color, we need the address of the pixel and the color of the pixel.
With the address and the color we can simply use a move instruction to write the color into the framebuffer.
A sample using the EDI register as a 32-bit address register (the default segment register is DS) to address the framebuffer with the move instruction.
x86 Intel syntax:
mov edi, Framebuffer   ; load the address (upper left corner) into the EDI register
mov DWORD [edi], Color ; write the color to the address at DS:EDI
The first instruction loads the EDI register with the address of the framebuffer and the second instruction writes the color into the framebuffer.
Hint for calculating the address of a pixel inside the framebuffer:
Some video modes use a larger scanline with more bytes than the horizontal resolution needs, with a part that lies outside of the visible view.
Dirk