I'm currently trying to download the equirectangular images Google displays in its 360-degree Street View panoramas to an image file so that I can display them in VR in Unreal Engine 4. I've tried a few things:
Requesting the panorama tiles from a constructed URL in the format described in the Street View API. This returns a file-not-found error for any pano ID other than the example one in the documentation. Perhaps I'm using the wrong method of getting a panorama ID? I used the following example to extract the pano ID and plugged that into the URL with tileX = tileY = 0 and a zoom level of 1, to no avail.
I've also tried downloading separate 2D images taken at 90-degree angles, but when I go to display them on the inside of a cube, the images are misaligned.
There's a tool called UnrealJS that I've been looking into in order to grab the panorama data and save it off, but my inexperience with Node.js and server-side JS has made this a very confusing, fruitless endeavor. Other programs I've looked into that allow you to extract these panoramic images use canvas tags to request the maps API and then save what Google's API writes to the canvas into a buffer. Is this the way to go? UnrealJS does support a bastardized version of HTML that I may be able to use - this, however, is less than ideal.
Street View panoramas are divided into a regular grid of tiles cut from an equirectangular image. This article explains how to get the URL of every tile. For zoom level 5 (the highest resolution) there are 26 by 13 tiles, each 512x512 pixels. All that is left is to download every tile and draw each one onto a large empty image at its position in the grid.
Note: by doing this you will be breaking Google's Terms of Service.
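For reference, here is a minimal browser-side sketch (JavaScript) of that stitching step. The tile endpoint below is the unofficial one such articles describe, so treat it as an assumption that may change or be blocked; and if the tile server does not send CORS headers, you would have to proxy the requests (for example through Node) before the canvas can be exported.

async function stitchPano(panoId) {
  // Zoom level 5: 26 x 13 tiles of 512 x 512 px each.
  const TILES_X = 26, TILES_Y = 13, TILE = 512, ZOOM = 5;
  const canvas = document.createElement('canvas');
  canvas.width = TILES_X * TILE;   // 13312 px wide
  canvas.height = TILES_Y * TILE;  // 6656 px tall
  const ctx = canvas.getContext('2d');

  for (let y = 0; y < TILES_Y; y++) {
    for (let x = 0; x < TILES_X; x++) {
      // Assumed (unofficial) tile URL pattern from the article above.
      const url = 'https://cbk0.google.com/cbk?output=tile&panoid=' + panoId +
                  '&zoom=' + ZOOM + '&x=' + x + '&y=' + y;
      const img = await loadImage(url);
      ctx.drawImage(img, x * TILE, y * TILE);  // place the tile on the grid
    }
  }
  return canvas.toDataURL('image/jpeg');  // the full equirectangular image
}

function loadImage(src) {
  return new Promise((resolve, reject) => {
    const img = new Image();
    img.crossOrigin = 'anonymous';  // needed so the canvas is not tainted
    img.onload = () => resolve(img);
    img.onerror = reject;
    img.src = src;
  });
}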
I'm looking to see how I can either shrink tiles in Power BI further than the drag feature lets me, or display multiple images from different API endpoints in the same tile.
For example, here is a SonarQube image output with each badge separated into its own image tile.
As you can see, this takes up a ton of dead space and could be done much more compactly. Being able to put all four images in one box would already help.
I did try the "Embed Code" option, but I couldn't get an image to show, since what I have is technically not an image file (.jpg or the like) but an endpoint (ex: http://server:9000/api/project_badges/measure?project=Project&metric=metric_name).
Any help is appreciated.
Sorry, the answer was simple: I needed to wrap each endpoint in its own p element (opened and closed around it) in the embedded tile code.
<p><img src="http://server:9000/api/project_badges/measure?project=Project_name&metric=metric_name1"></p>
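To show several badges in one tile, that pattern can simply be repeated inside the same embed code, one p element per endpoint (the metric names below are placeholders):

<p><img src="http://server:9000/api/project_badges/measure?project=Project_name&metric=metric_name1"></p>
<p><img src="http://server:9000/api/project_badges/measure?project=Project_name&metric=metric_name2"></p>
<p><img src="http://server:9000/api/project_badges/measure?project=Project_name&metric=metric_name3"></p>
<p><img src="http://server:9000/api/project_badges/measure?project=Project_name&metric=metric_name4"></p>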
We are using the Zurb Foundation framework. It has a component called Interchange that serves responsive images.
See https://foundation.zurb.com/sites/docs/interchange.html
While it does serve images based on the viewport, it does not support lazy loading, and we want to lazy load some images using the Intersection Observer API.
See https://developers.google.com/web/fundamentals/performance/lazy-loading-guidance/images-and-video/
Objective:
When we choose to lazy load something, we give the IMG the class "lazy" and Foundation Interchange serves a small, low-res placeholder image. That part is easy. We then use Intersection Observer to change one folder in the path so that it points to the high-res image. That is the harder part.
Important note:
Most techniques don't work for us because we are loading responsive images, so I can't simply point to one image; the image served varies with the viewport.
We want to apply a class to any image and have Intersection Observer drive the lazy load: the image loads a small, low-res version right away and swaps it out for the high-res version later on.
Instead of using data-src like most solutions, we want to change the path to the image.
For example, assume the SRC is:
<img class="lazy" src="assets/img1/test-blur2.jpg">
I want to have Observer watch and change the image path as follows:
<img class="lazy" src="assets/img/test-blur2.jpg">
In other words, I want to look for images with class="lazy", delete the "1" after /img, and then show the updated image.
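Here is a minimal sketch of that observer, assuming the placeholder lives under assets/img1/ and the full-size image under assets/img/ as in the example above. If Interchange rewrites the src from a data-interchange attribute, the same string replacement would have to be applied to the paths in that attribute instead, and the plugin re-initialized.

document.addEventListener('DOMContentLoaded', function () {
  const lazyImages = document.querySelectorAll('img.lazy');

  const observer = new IntersectionObserver(function (entries, obs) {
    entries.forEach(function (entry) {
      if (!entry.isIntersecting) return;
      const img = entry.target;
      // Delete the "1" after /img so the src points at the high-res folder.
      img.src = img.src.replace('/img1/', '/img/');
      img.classList.remove('lazy');
      obs.unobserve(img);              // swap each image only once
    });
  }, { rootMargin: '200px' });         // start loading slightly before it is visible

  lazyImages.forEach(function (img) { observer.observe(img); });
});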
Thanks in advance for any tips
While researching best practices and experimenting with multiple options for an ongoing project (a Unity3D iOS project in Vuforia with native integration, extracting frames with AVFoundation and then passing the images through cloud-based image recognition), I have come to the conclusion that I would like to use ARKit, the Vision framework, and Core ML; let me explain.
I am wondering how I would be able to capture ARFrames and use the Vision framework to detect and track a given object using a Core ML model.
Additionally, it would be nice to have a bounding box once the object is recognized, with the ability to add an AR object on a touch gesture, but that is something that could be implemented once the core project is solid.
This is undoubtedly possible, but I am unsure of how to pass the ARFrames to CoreML via Vision for processing.
Any ideas?
Update: Apple now has a sample code project that does some of these steps. Read on for those you still need to figure out yourself...
Just about all of the pieces are there for what you want to do... you mostly just need to put them together.
You obtain ARFrames either by periodically polling the ARSession for its currentFrame or by having them pushed to your session delegate. (If you're building your own renderer, that's ARSessionDelegate; if you're working with ARSCNView or ARSKView, their delegate callbacks refer to the view, so you can work back from there to the session to get the currentFrame that led to the callback.)
ARFrame provides the current capturedImage in the form of a CVPixelBuffer.
You pass images to Vision for processing using either the VNImageRequestHandler or VNSequenceRequestHandler class, both of which have methods that take a CVPixelBuffer as an input image to process.
You use the image request handler if you want to perform a request that uses a single image — like finding rectangles or QR codes or faces, or using a Core ML model to identify the image.
You use the sequence request handler to perform requests that involve analyzing changes between multiple images, like tracking an object's movement after you've identified it.
You can find general code for passing images to Vision + Core ML attached to the WWDC17 session on Vision, and if you watch that session the live demos also include passing CVPixelBuffers to Vision. (They get pixel buffers from AVCapture in that demo, but if you're getting buffers from ARKit the Vision part is the same.)
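As a rough sketch of how those pieces fit together in Swift (YourModel stands in for whatever Core ML classifier you add to the project, and the orientation handling is simplified):

import ARKit
import Vision

class FrameClassifier: NSObject, ARSessionDelegate {
    private var isProcessing = false

    // Wrap the Core ML model (placeholder name) in a Vision request.
    private lazy var request: VNCoreMLRequest = {
        let model = try! VNCoreMLModel(for: YourModel().model)
        return VNCoreMLRequest(model: model) { request, _ in
            guard let best = request.results?.first as? VNClassificationObservation else { return }
            print("\(best.identifier) \(best.confidence)")
        }
    }()

    // ARSessionDelegate pushes every new ARFrame here.
    func session(_ session: ARSession, didUpdate frame: ARFrame) {
        guard !isProcessing else { return }     // drop frames while one is in flight
        isProcessing = true
        let handler = VNImageRequestHandler(cvPixelBuffer: frame.capturedImage,
                                            orientation: .right,  // camera buffer is landscape
                                            options: [:])
        DispatchQueue.global(qos: .userInitiated).async {
            try? handler.perform([self.request])  // Vision + Core ML work off the main thread
            self.isProcessing = false
        }
    }
}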
One sticking point you're likely to have is identifying/locating objects. Most "object recognition" models people use with Core ML + Vision (including those that Apple provides pre-converted versions of on their ML developer page) are scene classifiers. That is, they look at an image and say, "this is a picture of a (thing)," not something like "there is a (thing) in this picture, located at (bounding box)".
Vision provides easy API for dealing with classifiers — your request's results array is filled in with VNClassificationObservation objects that tell you what the scene is (or "probably is", with a confidence rating).
If you find or train a model that both identifies and locates objects — and for that part, I must stress, the ball is in your court — using Vision with it will result in VNCoreMLFeatureValueObservation objects. Those are sort of like arbitrary key-value pairs, so exactly how you identify an object from those depends on how you structure and label the outputs from your model.
If you're dealing with something that Vision already knows how to recognize, instead of using your own model — stuff like faces and QR codes — you can get the locations of those in the image frame with Vision's API.
If after locating an object in the 2D image, you want to display 3D content associated with it in AR (or display 2D content, but with said content positioned in 3D with ARKit), you'll need to hit test those 2D image points against the 3D world.
Once you get to this step, placing AR content with a hit test is something that's already pretty well covered elsewhere, both by Apple and the community.
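For completeness, a sketch of that last step with an ARSCNView (here called sceneView), assuming you have already converted a Vision bounding box into a point in view coordinates; that conversion from Vision's normalized, bottom-left-origin space into view space is the fiddly part and is elided here:

// point: the object's location in sceneView's coordinate space
// (e.g. the center of the converted bounding box).
let results = sceneView.hitTest(point, types: [.featurePoint, .existingPlaneUsingExtent])
if let hit = results.first {
    let t = hit.worldTransform.columns.3
    let node = SCNNode(geometry: SCNSphere(radius: 0.01))  // any 3D content
    node.position = SCNVector3(t.x, t.y, t.z)
    sceneView.scene.rootNode.addChildNode(node)
}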
I am working on a project with gesture recognition. Now I want to prepare a presentation in which I can only show images. I have a series of images defining a gesture, and I want to combine them into a single image, the way motion history images are shown in the literature.
My question is simple: which functions in OpenCV can I use to make a motion history image from, let's say, 10 or more images defining the motion of a hand?
As an example I have the following image, and I want to show the hand's location with opacity directly dependent on the time reference.
I tried using GIMP to merge layers with different opacities to do the same thing, but the output is not good.
You could use cv::updateMotionHistory.
Actually, OpenCV also demonstrates its usage in samples/c/motempl.c.
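A minimal sketch of that approach with the OpenCV 2.4-style API (in OpenCV 3+ the same function lives in the contrib optflow module as cv::motempl::updateMotionHistory); the file names and the threshold are placeholders:

#include <opencv2/opencv.hpp>

int main()
{
    const int N = 10;                  // number of frames defining the gesture
    const double MHI_DURATION = N;     // keep the whole sequence in the history

    cv::Mat prevGray, mhi;
    for (int i = 0; i < N; ++i)
    {
        // Placeholder file names: hand00.png ... hand09.png
        cv::Mat gray = cv::imread(cv::format("hand%02d.png", i), cv::IMREAD_GRAYSCALE);
        if (mhi.empty())
            mhi = cv::Mat::zeros(gray.size(), CV_32FC1);

        if (!prevGray.empty())
        {
            // Silhouette of what moved between consecutive frames.
            cv::Mat silhouette;
            cv::absdiff(gray, prevGray, silhouette);
            cv::threshold(silhouette, silhouette, 30, 255, cv::THRESH_BINARY);
            // Stamp the silhouette into the history with the current "time".
            cv::updateMotionHistory(silhouette, mhi, static_cast<double>(i), MHI_DURATION);
        }
        prevGray = gray;
    }

    // Scale so that more recent motion appears brighter (time-dependent opacity).
    cv::Mat vis;
    mhi.convertTo(vis, CV_8UC1, 255.0 / MHI_DURATION);
    cv::imwrite("motion_history.png", vis);
    return 0;
}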
I'm building a webcam application as my C++ project in college. I am integrating Qt (for the GUI) and OpenCV (for image processing). My application will be a simple webcam app that accesses the webcam, shows/records video, captures images, and so on.
I also want a feature that adds cliparts to captured images or to the streaming video. During my research I found claims that there is no way to overlay two images using OpenCV; the best alternative I could find was to bake the clipart into the original image, making it a single image. That won't work for me, because I have to be able to move, resize, and rotate the clipart on my canvas.
So, I was wondering if anybody could tell me how to achieve the effect I want most efficiently.
I would really appreciate your help. The deadline for the project submission is closing in, and this is a huge bump on the road to completion. PLEEEASE... HELP!!
If you just want to stick a logo onto the OpenCV image, then you simply define a region of interest (ROI) on the destination video image and copy the source image into it (the details vary with each version of OpenCV).
If you want the logo to be semi-transparent, like a TV channel ID, you can still copy the image, but loop over the pixels writing each destination pixel as source_pixel/2 + dest_pixel/2.
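A minimal sketch of both variants using the C++ Mat API (frame and logo are assumed to be CV_8UC3, with the logo small enough to fit at the requested position); calling this every frame lets you move the clipart freely, and resizing/rotating is just cv::resize / cv::warpAffine applied to the logo beforehand:

#include <opencv2/opencv.hpp>

// Draw `logo` onto `frame` at (x, y), either opaque or blended 50/50.
void overlayLogo(cv::Mat& frame, const cv::Mat& logo, int x, int y, bool semiTransparent)
{
    // Region of interest on the destination frame -- a view, not a copy.
    cv::Mat roi = frame(cv::Rect(x, y, logo.cols, logo.rows));

    if (semiTransparent)
        cv::addWeighted(roi, 0.5, logo, 0.5, 0.0, roi);  // dest/2 + source/2
    else
        logo.copyTo(roi);                                // opaque overlay
}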