API requires incorrect Web-safe Base64 padding - google-admin-sdk

Summary - Currently, the Directory API handles padding in the User Photo's photo data (encoded as web-safe Base64) in a way that contradicts the documentation: padding should be converted from = to . but instead = is required. Looking for clarification on whether this is intended or not.
Details - I have been using the Google API to interface with User Photos, retrieving and updating them. The documentation is clear about how the web-safe Base64 format for the photo data needs to be presented:
For padding, the period (.) character is used instead of the RFC-4648 baseURL definition which uses the equals sign (=) for padding. This is done to simplify URL-parsing.
However, recently this has stopped working. I'm unsure of exactly when this happened. (Edit: Based on similar comments from years ago that I'm finding, this may have never worked and I just never happened to test a photo that encoded into something with padding.)
To test this I downloaded an existing image and re-uploaded it, and got the error Invalid value for ByteString. If I intercept the Base64 being returned and pass that same data back directly, I get no error.
The issue turned out to be the padding - the documentation states the = equals padding needs to be replaced with a . period. My example Base64 ended with two padding characters, which were turned into two periods as expected (and this gives the error). If I instead leave the padding as = equals signs, it works with no problem.
It turns out the Base64 returned from Google when you retrieve a user photo also has its padding as = equals characters, which seems to clearly contradict the documentation. I have also confirmed this through the Try It Now methods via the web, so it's not language- or API-client-specific.
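For anyone who wants to reproduce this, here is a minimal Python sketch of the two padding variants (the file name is a placeholder; which of the two forms the API actually accepts is exactly the question above):

import base64

with open("photo.jpg", "rb") as f:    # placeholder image file
    raw = f.read()

# Web-safe Base64: '-' and '_' instead of '+' and '/', with '=' padding per RFC 4648.
websafe = base64.urlsafe_b64encode(raw).decode("ascii")

# What the documentation says to send: '.' periods instead of '=' padding.
documented_form = websafe.replace("=", ".")

# What actually worked in my testing (and what the API returns): keep the '=' padding.
working_form = websafe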
So, did the behavior change without the documentation (last updated Feb 26th, 2015) being updated? Is this a permanent change or a bug?
Edit - According to some other posts it looks like this has been a longstanding issue; I may have never run into an image before that ended up with padding. The point stands - is the documentation accurate, or do I need to adjust for this?
Edit 2 - All signs point to this being either a bug or bad documentation. Either way, I was unable to find any issues in their tracking for this, so I have opened an issue for it. If I get official word in any case I will [try to remember to] come back and provide it as an answer.

Related

Did Google recently change UI to grant public access to bucket objects in Google Cloud Platform?

Ok, I've been using Google Cloud Platform for some video files that are viewable from a few web pages I built. I started this two or three years ago, and I have loved it.
But, now it appears they broke it, without warning/telling us.
So, in the platform's console, yesterday (for the first time since a month or two ago), I uploaded another video...that part went fine. But, when it came time to click on the checkbox to grant public access, the checkbox is now GONE. (The only part of the UI that looks NEW
is the column labeled 'public access'. Instead of just a check-box to toggle on or off, now there's a yellow triangle and an oval-shaped symbol. Once or twice, I was able to get a popup to appear saying 'edit permission', but that quickly led into the weeds.)
After half an hour or so, I finally thought to call platform support, and explained my problem to a guy (with just enough Australian accent to cause me to have to ask for repeats quite a bit...sigh).
So, they logged me a case# and I suggested I was headed to bed, and asked that we now use email (rather than the phone) to continue. Just before bed, I got the case# and a query about whether it was ok for them to 'change my console'. I replied to the email, saying yes, and went to bed.
So that was last night. This morning, re-reading their email, it seems to say that it could be 3 or 4 days before a more technical person will contact me.
After some re-reading of their platform-console docs, I'm now GUESSING that maybe they just nuked the public-access checkbox, and that now I'm supposed to spend hours (days?) taking a short course on IAM permissions and learn some new long-winded method.
(This whole mess could have been avoided if they'd just emailed us an informational warning of this UI change, with some new 5-step short list or tutorial on how to use their 'new, much more complicated, way to specify public-access'. From where I sit, this change is equivalent to Microsoft saying 'instead of that checkbox, you'll need to learn to make registry edits...see our platform docs on how to do that'.)
Right now, I have more than half a mind to seriously consider bailing out of Google's cloud storage and switching to one of the others. But I'm not quite ready yet to make that jump (from the frying pan into the fire?). :^)
Anyone else been down this road? What meeting did I miss? Is there a quicker way out of my dilemma, than just waiting for Google-support to get back to me?
It looks like the change you mention was introduced on July 18th. I’m not sure why, but judging by the change description, it looks like it is aimed to avoid accidentally making sensitive information public: “Objects can no longer be made public through one-click actions”.
You can find the procedure to make a single object public here. It can be achieved through the Console and won't take you more than a few minutes. Once the object is shared publicly, you can use the icon in the “public access” column to get the URL for the object.
You can also make all the content of a bucket public using a similar approach.
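If you prefer to do this programmatically rather than through the Console, a minimal sketch using the google-cloud-storage Python client would look roughly like this (bucket and object names are placeholders, and it assumes the bucket still uses ACL-based access control):

from google.cloud import storage

client = storage.Client()
bucket = client.bucket("my-video-bucket")      # placeholder bucket name
blob = bucket.blob("videos/my-video.mp4")      # placeholder object name

# Grants allUsers read access to this single object and prints its public URL.
blob.make_public()
print(blob.public_url)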
When you upload your objects into a bucket, you can upload them with the predefined ACL publicRead, and all your objects will have a public URL.
using System.IO;
using System.Threading.Tasks;
using Google.Cloud.Storage.V1;

public async Task UploadObjectAsync(string bucketName, string objectName, Stream source, string contentType = "image/jpeg")
{
    var storage = StorageClient.Create();

    // Upload with the publicRead predefined ACL so the object gets a public URL.
    await storage.UploadObjectAsync(bucketName, objectName, contentType, source, new UploadObjectOptions()
    {
        PredefinedAcl = PredefinedObjectAcl.PublicRead
    });
}
As I suspected. (I still wonder if they even considered sending an email to each registered/existing customer.)
Ok, yes (finally, after some practice), this solves it! Thx for those two answers.
(But in my view, their UI change is still a work in progress.) So, I have a SUGGESTION for ya, Google. Once one is into the permissions-edit dialog, and remembers to do an 'add', there are 3 fields. The first and third are fine...drop-downs with choices. But that middle entry needs work...how about something like an auto-guess-ahead: initialize the field to a suggested value of 'allUsers', so we don't have to remember what to type and how to spell it, or something along those lines.
EDIT: [Actually, it ought to be possible to make that field a drop-down-list choice, with 'allUsers' as one suggested value, and a second value as a text-entry (for specific user-names, etc).]
Unfortunately, it is not possible to list files without access to the bucket that contains them. This is due to the current design of the library, which requires that the bucket is loaded before listing its files.

Amazon CloudSearch - documents not deleted from index

I have a problem deleting documents from Amazon CloudSearch.
When I send a document for deletion I receive the response
{"status": "success", "adds": 0, "deletes": 5}
And then the video stays in the index with all fields reset to their default values, instead of being deleted.
The documentation is not clear if this is the normal behaviour or a bug.
Any one else experienced this?
This surprised me too but appears to be normal behavior. The 'deleted' documents aren't searchable anymore since their fields are all null so they shouldn't cause any problems.
The problem I have with this is that they can be returned if you search for something like "-zomgwtfbbq", since they don't contain the term "zomgwtfbbq".
It is also confusing since it makes your dashboard show one count (the "searchable" documents) but if you run a test search for -zomgwtfbbq (what I have been using as a proxy for "get all documents"), you get a different number. Took me a while to figure out why.
Despite what they say about setting the version to max uint32 "permanently removing" the document, it will still be there. The problem is that they consider these documents unsearchable, but they're not.
Are you specifying the version number when you delete the document?
When deleting documents, note that deleting version max(uint32_t) will permanently remove the document from your domain. Because it is not possible to specify a higher version number, there is no way to add a later version of the document.
http://docs.aws.amazon.com/cloudsearch/latest/developerguide/versioning.html
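For reference, a hedged sketch of what a versioned delete batch could look like when posted to a domain's document endpoint under the 2011-02-01 API that the versioning doc above describes (the endpoint host and document ID are placeholders):

import json
import urllib.request

# Placeholder document-service endpoint for your search domain.
DOC_ENDPOINT = "http://doc-mydomain-xxxxxxxxxxxxxxxxxxxxxxxxxx.us-east-1.cloudsearch.amazonaws.com"

batch = [
    {
        "type": "delete",
        "id": "video_12345",      # placeholder document ID
        "version": 4294967295,    # max uint32, which the docs say permanently removes it
    }
]

request = urllib.request.Request(
    DOC_ENDPOINT + "/2011-02-01/documents/batch",
    data=json.dumps(batch).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
print(urllib.request.urlopen(request).read().decode("utf-8"))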

Mediawiki mass user delete/merge/block

I have 500 or so spambots and about 5 actual registered users on my wiki. I have used nuke to delete their pages but they just keep reposting. I have spambot registration under control using reCaptcha. Now, I just need a way to delete/block/merge about 500 users at once.
You could just delete the accounts from the user table manually, or at least disable their authentication info with a query such as:
UPDATE /*_*/user SET
user_password = '',
user_newpassword = '',
user_email = '',
user_token = ''
WHERE
/* condition to select the users you want to nuke */
(Replace /*_*/ with your $wgDBprefix, if any. Oh, and do make a backup first.)
Wiping out the user_password and user_newpassword fields prevents the user from logging in. Also wiping out user_email prevents them from requesting a new password via email, and wiping out user_token drops any active sessions they may have.
Update: Since I first posted this, I've had further experience of cleaning up large numbers of spam users and content from a MediaWiki installation. I've documented the method I used (which basically involves first deleting the users from the database, then wiping out all the now-orphaned revisions, and finally running rebuildall.php to fix the link tables) in this answer on Webmasters Stack Exchange.
Alternatively, you might also find Extension:RegexBlock useful:
"RegexBlock is an extension that adds special page with the interface for blocking, viewing and unblocking user names and IP addresses using regular expressions."
There are risks involved in applying the solution in the accepted answer. The approach may damage your database! It incompletely removes users, doing nothing to preserve referential integrity, and will almost certainly cause display errors.
A much better solution is presented here (a prerequisite is that you have installed the User Merge extension):
I have a little awkward way to accomplish the bulk merge through a work-around. Hope someone finds it useful! (You must have a little string-concatenation skill in spreadsheets; or you may use a Python or similar script - a sketch of such a script appears below; or use a text editor with bulk-replacement features.)
1. Prepare a list of all spam user IDs and store them in a spreadsheet or text file. The list may be prepared from the user creation logs. If you do have DB access, the wiki user table can be imported into a local list.
2. The POST method used for submitting the Merge & Delete User form (by clicking the button) should be converted to a GET method. This will get us a long URL. See the second comment (by Matthew Simoneau, dated 13/Jan/2009) at http://www.mathworks.com/matlabcentral/newsreader/view_thread/242300 for the method.
3. The resulting URL string should be something like below:
http: //(Your Wiki domain)/Special:UserMerge?olduser=(OldUserNameHere)&newuser=(NewUserNameHere)&deleteuser=1&token=0d30d8b4033a9a523b9574ccf73abad8%2B\
4. Now, divide this URL into four sections:
A: http: //(Your Wiki domain)/Special:UserMerge?olduser=
B: (OldUserNameHere)
C: &newuser=(NewUserNameHere)&deleteuser=1
D: &token=0d30d8b4033a9a523b9574ccf73abad8%2B\
5. Now, using a text editor or spreadsheet, prefix each spam user ID with part A and suffix each with parts C and D. Part C will include the new user (a specially created single dummy user ID). Part D, the token string, is a session-dependent token that changes per user per session, so you will need to get a new token every time a new session/batch of work is started.
6. With the above step, you should get a long list of URLs, each good to do a Merge & Delete operation for one user. We can now create a simple HTML file, view it, and use a batch downloader like DownThemAll in Firefox. Add two more pieces to each line: <a href=" at the beginning and ">Linktext</a> at the end. Also add <html> at the top and </html> at the bottom, and save the file as (for example) userlist.html.
7. Open the file in Firefox, use the DownThemAll add-on, and download all the links! Effectively, you are visiting the Merge & Delete page for each user and clicking the button!
Although this might look like a lengthy and tricky job at first, once you follow this method you can remove tens of thousands of users without much manual effort. You can verify that the operation is going well by opening some of the downloaded HTML files (or by looking through the recent changes in another window). One advantage is that it does not directly edit the MySQL tables, nor does it require direct database access.
I did a bit of rewriting to the quoted text, since the original text contains some flaws.
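For the string-concatenation step (step 5 above), a minimal Python sketch could generate the URL list and the HTML file in one go. The domain, dummy user name and token below are placeholders you must supply yourself, and the token has to be refreshed for each session:

from urllib.parse import quote

WIKI_DOMAIN = "wiki.example.org"            # placeholder: your wiki's domain
DUMMY_USER = "SpamDump"                     # placeholder: the dummy target account
TOKEN = "PASTE_A_FRESH_TOKEN_HERE%2B%5C"    # session-dependent, refresh per batch

# One spam user ID per line in this file (step 1 above).
with open("spam_users.txt") as f:
    spam_users = [line.strip() for line in f if line.strip()]

lines = ["<html>"]
for user in spam_users:
    url = ("http://{}/Special:UserMerge?olduser={}&newuser={}&deleteuser=1&token={}"
           .format(WIKI_DOMAIN, quote(user), quote(DUMMY_USER), TOKEN))
    lines.append('<a href="{}">{}</a>'.format(url, user))
lines.append("</html>")

with open("userlist.html", "w") as f:
    f.write("\n".join(lines))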

Issues with raw_post_data decoding in Django

I have stumbled on a strange issue that I can't resolve:
In my Django app there is a method which gets hit by a POST from a Java applet, which sends it a JSON object. The Django method parses it like so:
req = json.loads(request.raw_post_data)
and based on the results returns a value. I haven't written this code, but yesterday I was sent to investigate an error triggered in this method. It was saying there was "ValueError: Expecting property name: line 1 column 1 (char 1)".
What I discovered is that my raw post data looks like this:
{#012#011"ImmutableMachineFactors": #012#011{#012#011#011"machineName": "lukka",#012#011#011"osName": "MacOS"}}
Its type was string; however, my attempts to replace these weird characters with spaces or with nothing failed. It would just ignore the sub() command. I know that raw_post_data returns a bytestring, but when I tried to convert it to a regular string using:
mystring.decode('utf-8')
it did add the u'' notation, but didn't remove those weird characters. Stranger still, in many cases (on my personal machine) Django would happily convert this kind of data into JSON; it only fails sometimes, which led me to believe that the JSON which triggered the error was malformed. But when I stripped out all the #011 and #012 characters, it parsed perfectly.
My questions are:
1) What are those crazy things (#011, #012)? I tried to google around, but these are very common things to find in a search, so I couldn't find anything relevant.
2) How can I turn this bytestring into a regular string so that I can replace those characters? Or is that the wrong way to approach this problem?
Thanks!
Luka
This may be way too late to help, but since QueryDict instances (request.POST or request.DATA) are immutable, it's reasonable to expect that request.raw_post_data is also immutable. You'd have to make a copy before changing it.
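If, as the values suggest, #011 and #012 are the octal escapes for tab and newline and are literally present in the payload, then the other key point is that decode() and re.sub() return new strings rather than modifying the original, so you have to keep their return values. A minimal sketch (the field names are taken from the payload above):

import json
import re

raw = request.raw_post_data          # a bytestring in older Django versions

text = raw.decode("utf-8")           # decode() returns a NEW unicode string

# '#011' and '#012' are the octal escape sequences for tab and newline.
# re.sub() does not modify 'text' in place; keep the returned value.
cleaned = re.sub(r"#01[12]", "", text)

data = json.loads(cleaned)
machine_name = data["ImmutableMachineFactors"]["machineName"]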

Externally linked images - How to prevent cross site scripting

On my site, I want to allow users to add reference to images which are hosted anywhere on the internet. These images can then be seen by all users of my site. As far as I understand, this could open the risk of cross site scripting, as in the following scenario:
User A adds a link to a gif which he hosts on his own webserver. This webserver is configured in such a way, that it returns javascript instead of the image.
User B opens the page containing the image. Instead of seeing the image, the javascript is executed.
My current security measures are such that all content is encoded both on save and on open.
I am using asp.net(c#) on the server and a lot of jquery on the client to build ui elements, including the generation of image tags.
Is this fear of mine correct? Am I missing any other important security loopholes here? And most important of all, how do I prevent this attack? The only secure way I can think of right now is to web-request the image URL on the server and check whether it contains anything other than binary data...
Checking that the file is indeed an image won't help. An attacker could return one thing when the server requests it and another when a potential victim makes the same request.
Having said that, as long as you restrict the URL to only ever be printed inside the src attribute of an img tag, then you have a CSRF flaw, but not an XSS one.
Someone could for instance create an "image" URL along the lines of:
http://yoursite.com/admin/?action=create_user&un=bob&pw=alice
Or, more realistically but more annoyingly; http://yoursite.com/logout/
If all sensitive actions (logging out, editing profiles, creating posts, changing language/theme) have tokens, then an attack vector like this wouldn't give the user any benefit.
But going back to your question: unless there's some current browser bug I can't think of, you won't have XSS. Oh, and remember to ensure their image URL doesn't include odd characters; i.e. an image URL of "><script>alert(1)</script><!-- may obviously have bad effects. I presume you know to escape that.
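As a rough illustration of the "only ever print it inside the src attribute, and escape odd characters" point, here is a minimal validation-and-escaping sketch. It's shown in Python for brevity; the same idea applies to the ASP.NET code that renders the tag, and the function name is made up:

from html import escape
from urllib.parse import urlparse

def safe_img_tag(user_url):
    """Return an <img> tag for an externally hosted image, or raise on suspect input."""
    parsed = urlparse(user_url)

    # Only allow plain http(s) URLs; this rejects javascript:, data:, etc.
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError("not an http(s) URL")

    # HTML-escape so characters like " < > can never break out of the attribute.
    return '<img src="{}" alt="">'.format(escape(user_url, quote=True))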
Your approach to security is incorrect.
Don't approach the topic as "I have user input, so how can I prevent XSS?". Rather, approach it like this: "I have user input - it should be as restrictive as possible, i.e. allowing nothing through." Then, based on that, allow only what's absolutely essential: plain-text strings thoroughly sanitized so that nothing but a URL, using only the specific characters URLs need, gets through. Once it is sanitized, only allow images. Testing for that is hard because it can easily be tricked; however, it should still be tested for. Then, because you're using an input field, make sure that everything from javascript and escape characters to HTML, XML and SQL injection is converted to plain text and rendered harmless and useless. Consider your users to be both idiots and hackers - assume they'll input everything incorrectly and try to hack something into your input space.
Aside from that, you may run into some legal issues with regard to copyright. Copyrighted images generally may not be used on other people's sites without the copyright owner's consent and permission, usually obtained in writing (or email). So allowing users the opportunity to simply lift images from a site could run the risk of them taking copyrighted material and reposting it on your site without permission, which is illegal. Some sites are okay with citing the source, others require a fee to be paid, and others will sue you and bring your whole domain down for copyright infringement.