Google Speech to Text: InvalidArgument: 400 Must use single channel (mono) audio, but WAV header indicates 1 channels - google-cloud-platform

I am using the Google Cloud Platform to convert some audio into text files through the Google Speech-to-Text API. I keep getting the error: google.api_core.exceptions.InvalidArgument: 400 Must use single channel (mono) audio, but WAV header indicates 1 channels.
Here is my code:
config_wave_enhanced = speech.types.RecognitionConfig(
    # sample_rate_hertz=44100,
    encoding='LINEAR16',
    enable_automatic_punctuation=True,
    language_code='en-US',
    # use_enhanced=True,
    model='video',
    enable_separate_recognition_per_channel=True,
    audio_channel_count=2,
)
operation = speech_client.long_running_recognize(
    config=config_wave_enhanced,
    audio=long_audi_wave,
)
response = str(operation.result(timeout=90))
Can anyone help me solve this error? I'm going crazy here.

Setting audio_channel_count = 1 might help.

Convert your audio to a single channel. You can do this from the command line with ffmpeg -i stereo.wav -ac 1 mono.wav. Also set audio_channel_count = 1, as Christian Adib mentioned.
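Before converting anything, it can help to confirm what the WAV header actually declares. The following is a minimal sketch using Python's standard wave module; the helper names wav_channel_count and needs_downmix are hypothetical and not part of the Speech-to-Text client library.

```python
import wave


def wav_channel_count(path):
    """Return the number of channels declared in the WAV file's header."""
    with wave.open(path, "rb") as wav_file:
        return wav_file.getnchannels()


def needs_downmix(path):
    """Hypothetical helper: True when the file is not mono and should be
    converted (for example with ffmpeg -ac 1) before it is sent."""
    return wav_channel_count(path) != 1
```

If needs_downmix returns True for your file, the ffmpeg conversion above applies; if it returns False, the file is already mono and only the config needs changing.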

Google Speech to Text Error: "Invalid recognition 'config': bad encoding.." for an MP3 file

I'm recording audio in a react web app using the "mic-recorder-to-mp3" node package.
I've used MediaInfo to look at the audio files produced using this library (here's a sample file) and it shows the following information:
So it doesn't appear to be corrupted or anything...however, when I run Google's Speech to Text API with the following code, I get the error: "Invalid recognition 'config': bad encoding.."
const client = new speech.SpeechClient();
//configure the request:
const config = {
enableWordTimeOffsets: true,
sampleRateHertz: 48000,
encoding: 'MP3',
languageCode: 'en-US',
};
const audio = {
content: fs.readFileSync(filename).toString('base64'),
};
const request = {
config: config,
audio: audio,
};
// Detects speech in the audio file
const [response] = await client.recognize(request);
I can't understand what's going wrong here...any help would be appreciated!
I was able to reproduce the issue, and the encoding appears to be the root cause. I ran the gcloud ml speech recognize command on the MP3 file and got an empty response:
gcloud ml speech recognize gs://MY_BUCKET/audioClip.mp3 --language-code=en-US --encoding=linear16 --sample-rate=48000
{}
After that, I changed the encoding of the file:
ffmpeg -i audioClip.mp3 audioClip.wav
Then I tried again and voilà:
gcloud ml speech recognize gs://MY_BUCKET/audioClip.wav --language-code=en-US --encoding=linear16 --sample-rate=48000
{
"results": [
{
"alternatives": [
{
"confidence": 0.7809482,
"transcript": "testing testing 1 2 3"
}
]
}
]
}
Please note that, according to this documentation, MP3 encoding is a Beta feature and is only available in v1p1beta1. So you should consider converting your file before sending it to the Speech-to-Text API.
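If the conversion has to happen inside a script rather than by hand, one option is to build the ffmpeg command in Python. This is only a sketch: it assumes ffmpeg is installed and on the PATH, and the function names are hypothetical. The -ac 1, -ar and -c:a pcm_s16le options (mono, explicit sample rate, 16-bit PCM) are additions to the plain ffmpeg -i command used above, chosen to match the linear16 encoding and 48000 Hz sample rate in the gcloud call.

```python
import subprocess


def build_ffmpeg_command(src, dst, sample_rate=48000):
    """Build an ffmpeg command that writes mono 16-bit PCM (LINEAR16) WAV."""
    return [
        "ffmpeg",
        "-i", src,
        "-ac", "1",               # downmix to a single channel
        "-ar", str(sample_rate),  # resample to the rate declared in the config
        "-c:a", "pcm_s16le",      # 16-bit little-endian PCM
        dst,
    ]


def convert_to_linear16(src, dst, sample_rate=48000):
    """Run the conversion; raises CalledProcessError if ffmpeg fails."""
    subprocess.run(build_ffmpeg_command(src, dst, sample_rate), check=True)
```

For example, convert_to_linear16("audioClip.mp3", "audioClip.wav") would produce a file suitable for a request with encoding LINEAR16 and sampleRateHertz 48000.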

Why can I only download the first episode video on bilibili with youtube-dl?

I can download the first episode of a series.
youtube-dl https://www.bilibili.com/video/av90163846?p=1
Now I want to download all episodes of the series.
for i in $(seq 1 55)
do
    youtube-dl https://www.bilibili.com/video/av90163846?p=$i
done
None of the episodes except the first can be downloaded; all of them produce the same output, shown below:
[BiliBili] 90163846: Downloading webpage
[BiliBili] 90163846: Downloading video info page
[download] 【合集300集全】地道美音 美国中小学教学 自然科学 社会常识-90163846.flv has already been downloaded
Please try it and see what happens. How can I fix this?
@Christos Lytras, something strange happens with your code:
for i in $(seq 1 55)
do
youtube-dl https://www.bilibili.com/video/av90163846?p=$i -o "%(title)s-%(id)s-$i.%(ext)s"
done
It can certainly download videos from bilibili, but the downloaded videos all have different names and the same content: every one is identical to the first episode. Try it and you will see.
This error occurs because youtube-dl ignores the URI parameters after ? when building the file name, so each subsequent download gets the same name as the previous one and is skipped because a file with that name already exists. The solution is to use the --output template filesystem option to set a file name that includes an index, using the shell variable i.
Filesystem Options
-o, --output TEMPLATE Output filename template, see the "OUTPUT
TEMPLATE" for all the info
OUTPUT TEMPLATE
The -o option allows users to indicate a
template for the output file names.
The basic usage is not to set any template arguments when downloading
a single file, like in youtube-dl -o funny_video.flv "https://some/video". However, it may contain special sequences that
will be replaced when downloading each video. The special sequences
may be formatted according to python string formatting operations. For
example, %(NAME)s or %(NAME)05d. To clarify, that is a percent symbol
followed by a name in parentheses, followed by formatting operations.
Allowed names along with sequence type are:
id (string): Video identifier
title (string): Video title
url (string): Video URL
ext (string): Video filename extension
...
For your case, to use the i in the output filename, you can use something like this:
for i in $(seq 1 55)
do
youtube-dl https://www.bilibili.com/video/av90163846?p=$i -o "%(title)s-%(id)s-$i.%(ext)s"
done
This uses the title, the id, the i variable as an index, and ext for the file extension.
You can check the Output Template variables for more options for defining the file name.
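The same idea can be sketched in Python by generating one command per episode, each with its own output template. This only illustrates how the index keeps the file names distinct; the function name episode_commands is hypothetical, and the sketch builds the command lists without running them.

```python
def episode_commands(base_url, first, last):
    """Build one youtube-dl command per episode with a unique output template."""
    commands = []
    for i in range(first, last + 1):
        url = f"{base_url}?p={i}"
        # The %(...)s fields are expanded by youtube-dl; only {i} is filled here.
        template = f"%(title)s-%(id)s-{i}.%(ext)s"
        commands.append(["youtube-dl", url, "-o", template])
    return commands
```

Each list can then be passed to subprocess.run. Because every template carries a different index, no two episodes resolve to the same file name.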
UPDATE
Apparently, bilibili.com uses JavaScript to set up the video player and fetch the video files, so there is no way to extract the whole playlist using youtube-dl. I suggest you use Lux, which supports Bilibili playlists out of the box. It has installers for all major operating systems, and you can use it like this to download the whole playlist:
lux -p https://www.bilibili.com/video/av90163846
Or, if you want to download only up to video 55, you can use the -end 55 CLI option like this:
lux -end 55 -p https://www.bilibili.com/video/av90163846
You can use the -start, -end or -items option to specify the download
range of the list:
-start
Playlist video to start at (default 1)
-end
Playlist video to end at
-items
Playlist video items to download. Separated by commas like: 1,5,6,8-10
For bilibili playlists only:
-eto
File name of each bilibili episode doesn't include the playlist title
If you want to only get information of a playlist without downloading files, then use the -i command line option like this:
lux -i -p https://www.bilibili.com/video/av90163846
will output something like this:
Site: 哔哩哔哩 bilibili.com
Title: 【合集300集全】地道美音 美国中小学教学 自然科学 社会常识 P1 【001】Parts of Plants
Type: video
Streams: # All available quality
[64] -------------------
Quality: 高清 720P
Size: 308.24 MiB (323215935 Bytes)
# download with: lux -f 64 ...
[32] -------------------
Quality: 清晰 480P
Size: 201.57 MiB (211361230 Bytes)
# download with: lux -f 32 ...
[16] -------------------
Quality: 流畅 360P
Size: 124.75 MiB (130809508 Bytes)
# download with: lux -f 16 ...
Site: 哔哩哔哩 bilibili.com
Title: 【合集300集全】地道美音 美国中小学教学 自然科学 社会常识 P2 【002】Life Cycle of a Plant
Type: video
Streams: # All available quality
[64] -------------------
Quality: 高清 720P
Size: 227.75 MiB (238809781 Bytes)
# download with: lux -f 64 ...
[32] -------------------
Quality: 清晰 480P
Size: 148.96 MiB (156191413 Bytes)
# download with: lux -f 32 ...
[16] -------------------
Quality: 流畅 360P
Size: 94.82 MiB (99425641 Bytes)
# download with: lux -f 16 ...

ExportDicomData request of Google Cloud Healthcare API in GitHub tutorials never finishes

I'm trying the AutoML Vision ML codelab from the Cloud Healthcare API GitHub tutorials.
https://github.com/GoogleCloudPlatform/healthcare/blob/master/imaging/ml_codelab/breast_density_auto_ml.ipynb
I ran the Export DICOM data cell in the Convert DICOM to JPEG section; the request succeeded, as did all the preceding cells.
But waiting for the operation to complete times out, and the operation never finishes.
(The ExportDicomData request status on the Dataset page stays "Running" for over a day. I tried many times, but all the requests piled up in the "Running" state. A few times I started over from scratch, and the results were the same.)
What I have done so far:
1) Remove "output_config" since INVALID ARGUMENT error occurs.
https://github.com/GoogleCloudPlatform/healthcare/issues/133
2) Enable Cloud Resource Manager API since it is needed.
This is the cell code.
# Path to export DICOM data.
dicom_store_url = os.path.join(HEALTHCARE_API_URL, 'projects', project_id, 'locations', location, 'datasets', dataset_id, 'dicomStores', dicom_store_id)
path = dicom_store_url + ":export"
# Headers (send request in JSON format).
headers = {'Content-Type': 'application/json'}
# Body (encoded in JSON format).
# output_config = {'output_config': {'gcs_destination': {'uri_prefix': jpeg_folder, 'mime_type': 'image/jpeg; transfer-syntax=1.2.840.10008.1.2.4.50'}}}
output_config = {'gcs_destination': {'uri_prefix': jpeg_folder, 'mime_type': 'image/jpeg; transfer-syntax=1.2.840.10008.1.2.4.50'}}
body = json.dumps(output_config)
resp, content = http.request(path, method='POST', headers=headers, body=body)
assert resp.status == 200, 'error exporting to JPEG, code: {0}, response: {1}'.format(resp.status, content)
print('Full response:\n{0}'.format(content))
# Record operation_name so we can poll for it later.
response = json.loads(content)
operation_name = response['name']
This is the result of waiting.
Waiting for operation completion...
Full response:
{
  "name": "projects/my-datalab-tutorials/locations/us-central1/datasets/sample-dataset/operations/18300485449992372225",
  "metadata": {
    "@type": "type.googleapis.com/google.cloud.healthcare.v1beta1.OperationMetadata",
    "apiMethodName": "google.cloud.healthcare.v1beta1.dicom.DicomService.ExportDicomData",
    "createTime": "2019-08-18T10:37:49.809136Z"
  }
}
AssertionError Traceback (most recent call last)
<ipython-input-18-1a57fd38ea96> in <module>()
21 timeout = time.time() + 10*60 # Wait up to 10 minutes.
22 path = os.path.join(HEALTHCARE_API_URL, operation_name)
---> 23 _ = wait_for_operation_completion(path, timeout)
<ipython-input-18-1a57fd38ea96> in wait_for_operation_completion(path, timeout)
15
16 print('Full response:\n{0}'.format(content))
---> 17 assert success, "operation did not complete successfully in time limit"
18 print('Success!')
19 return response
AssertionError: operation did not complete successfully in time limit
API Version is v1beta1.
I was wondering if somebody has any suggestions.
Thank you.
After I kept retrying several times and left it running overnight, it finally succeeded. I don't know why.
There was a recent update to the codelabs. The error message is due to the timeout in the codelab and not the actual operation. This has been addressed in the update. Please let me know if you are still running into any issues!
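Since the failure came from the notebook's own time limit rather than from the export, a polling loop with a more generous, configurable timeout is one way to wait on such an operation. The sketch below is not the codelab's wait_for_operation_completion; it assumes a caller-supplied fetch_operation callable (hypothetical) that returns the operation resource as a dict, and it relies on the "done" and "error" fields that Google long-running operations expose.

```python
import time


def wait_for_operation(fetch_operation, timeout_s=3600.0, interval_s=10.0,
                       sleep=time.sleep, clock=time.monotonic):
    """Poll until the operation reports done, or raise TimeoutError."""
    deadline = clock() + timeout_s
    while True:
        operation = fetch_operation()
        if operation.get("done"):
            if "error" in operation:
                raise RuntimeError("operation failed: %r" % (operation["error"],))
            return operation
        if clock() >= deadline:
            # Only the wait gave up; the operation itself may still be running.
            raise TimeoutError("operation still running after %s seconds" % timeout_s)
        sleep(interval_s)
```

A timeout here says nothing about whether the export failed, so it is worth checking the operation status again later before resubmitting the request.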

APNS issue with django

I'm using the following project for enabling APNS in my project:
https://github.com/stephenmuss/django-ios-notifications
I'm able to send and receive push notifications on my production app fine, but the sandbox APNs is having strange issues that I'm not able to solve. It constantly fails to connect to the push service. When I manually call _connect() on the APNService or FeedbackService classes, I get the following error:
File "/Users/MyUser/git/prod/django/ios_notifications/models.py", line 56, in _connect
self.connection.do_handshake()
Error: [('SSL routines', 'SSL3_READ_BYTES', 'sslv3 alert handshake failure')]
I tried recreating the APN certificate a number of times and constantly get the same error. Is there anything else I'm missing?
I'm using the endpoints gateway.push.apple.com and gateway.sandbox.push.apple.com for connecting to the service. Is there anything else I should look into for this? I have read the following:
Apns php error "Failed to connect to APNS: 110 Connection timed out."
Converting PKCS#12 certificate into PEM using OpenSSL
Error Using PHP for iPhone APNS
Turns out Apple changed ssl context from SSL3 to TLSv1 in development. They will do this in Production eventually (not sure when). The following link shows my pull request which was accepted into the above project:
https://github.com/stephenmuss/django-ios-notifications/commit/879d589c032b935ab2921b099fd3286440bc174e
Basically, use OpenSSL.SSL.TLSv1_METHOD if you're using python or something similar in other languages.
Although OpenSSL.SSL.SSLv3_METHOD works in production, it may not work in the near future. OpenSSL.SSL.TLSv1_METHOD works in production and development.
UPDATE
Apple will remove SSL 3.0 support in production on October 29th, 2014 due to the POODLE vulnerability.
https://developer.apple.com/news/?id=10222014a
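For code that uses Python's standard ssl module instead of pyOpenSSL, the equivalent precaution is to build a context that never negotiates SSLv3. The sketch below is an assumption-laden illustration, not the change made in django-ios-notifications: it uses Python 3's ssl.SSLContext, and it sets the floor at TLS 1.2 rather than the TLSv1 mentioned above. The function name make_tls_client_context is hypothetical.

```python
import ssl


def make_tls_client_context(certfile=None, keyfile=None):
    """Return a client-side TLS context that refuses SSLv3 and early TLS."""
    context = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    # Assumption: TLS 1.2 as the minimum; the answer above used TLSv1.
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    if certfile is not None:
        # Client certificate used to authenticate to the push service.
        context.load_cert_chain(certfile, keyfile)
    return context
```

A socket would then be wrapped with context.wrap_socket(sock, server_hostname=host) before connecting.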
I have worked on APNs using Python and Django. For this you need three things: the URL, the port, and the certificate provided by Apple for authentication.
views.py
import socket, ssl, json, struct
theCertfile = '/tmp/abc.cert' ## absolute path where certificate file is placed.
ios_url = 'gateway.push.apple.com'
ios_port = 2195
deviceToken = '3234t54tgwg34g' ## ios device token to which you want to send notification
def ios_push(msg, theCertfile, ios_url, ios_port, deviceToken):
    thePayLoad = {
        'aps': {
            'alert': msg,
            'sound': 'default',
            'badge': 0,
        },
    }
    theHost = (ios_url, ios_port)
    data = json.dumps(thePayLoad)
    deviceToken = deviceToken.replace(' ', '')
    byteToken = deviceToken.decode('hex')  # Python 2
    theFormat = '!BH32sH%ds' % len(data)
    theNotification = struct.pack(theFormat, 0, 32, byteToken, len(data), data)
    # Create our connection using the certfile saved locally
    ssl_sock = ssl.wrap_socket(socket.socket(socket.AF_INET, socket.SOCK_STREAM), certfile=theCertfile)
    ssl_sock.connect(theHost)
    # Write out our data
    ssl_sock.write(theNotification)
    # Close the connection -- apple would prefer that we keep
    # a connection open and push data as needed.
    ssl_sock.close()
Hopefully this would work for you.
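The code above is Python 2 (str.decode('hex') does not exist in Python 3). As a rough sketch of how the same notification frame could be packed in Python 3, the function below mirrors the '!BH32sH%ds' layout used above: a command byte, the token length, the 32-byte token, the payload length, and the JSON payload. The function name is hypothetical, and the explicit 32-byte check is an added assumption that matches the 32s field in the format string.

```python
import json
import struct


def build_legacy_apns_frame(device_token_hex, message):
    """Pack a notification frame in the same layout as the Python 2 code above."""
    payload = json.dumps(
        {"aps": {"alert": message, "sound": "default", "badge": 0}}
    ).encode("utf-8")
    token = bytes.fromhex(device_token_hex.replace(" ", ""))
    if len(token) != 32:
        raise ValueError("device token must decode to exactly 32 bytes")
    frame_format = "!BH32sH%ds" % len(payload)
    return struct.pack(frame_format, 0, 32, token, len(payload), payload)
```

This covers only the packing step; opening the TLS connection and writing the frame still have to be done separately.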

'Cannot parse input stream' error when updating defects in Rally via pyral

I am using the Python Toolkit for Rally REST API to update defects on our Rally server. I have confirmed that I am able to make contact with the server and authenticate fine by getting a list of current defects. I am running into issues with updating them. I am using Python 2.7.3 with pyral 0.9.1 and requests 0.13.3.
Also, I am passing 'verify=False' to the Rally() call and have made appropriate changes to the restapi module to compensate for this.
Here is my test code:
import sys
from pyral import Rally, rallySettings
server = "rallydev.server1.com"
user = "user@mycompany.com"
password = "trial"
workspace = "trialWorkspace"
project = "Testing Project"
defectID = "DE192"
rally = Rally(server, user, password, workspace=workspace,
              project=project, verify=False)
defect_data = { "FormattedID" : defectID,
                "State" : "Closed"
              }
try:
    defect = rally.update('Defect', defect_data)
except Exception, details:
    sys.stderr.write('ERROR: %s \n' % details)
    sys.exit(1)
print "Defect %s updated" % defect.FormattedID
When I run the script:
[temp]$ ./updefect.py
ERROR: Unable to update the Defect
If I change the code in the RallyRESTResponse function to print out the value of self.errors when found (line 164 of rallyresp.py), I get this output:
[temp]$ ./updefect.py
[u"Cannot parse input stream due to I/O error as JSON document: Parse error: expected '{' but saw '\uffff' [ chars read = >>>\uffff<<< ]"]
ERROR: Unable to update the Defect
I did find another question that sounds like it might possibly be related to mine here:
App SDK: Erorr parsing input stream when running query
Can you provide any assistance?
Pairing Michael's observation regarding the GZIP encoding with that of another astute Rally customer working a Support case on the issue - it appears that some versions of the requests module will default to GZIP compression if the content-type is not specifically defined.
The fix is to set content-type to application/json in the REST Headers section of pyral's config.py:
RALLY_REST_HEADERS = \
{
'X-RallyIntegrationName' : 'Python toolkit for Rally REST API',
'X-RallyIntegrationVendor' : 'Rally Software Development',
'X-RallyIntegrationVersion' : '%s.%s.%s' % __version__,
'X-RallyIntegrationLibrary' : 'pyral-%s.%s.%s' % __version__,
'X-RallyIntegrationPlatform' : 'Python %s' % platform.python_version(),
'X-RallyIntegrationOS' : platform.platform(),
'User-Agent' : 'Pyral Rally WebServices Agent',
'Content-Type' : 'application/json',
}
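To illustrate what the header change amounts to at the HTTP level, here is a small sketch that builds a JSON request with an explicit Content-Type. It uses Python 3's standard urllib.request rather than the requests module that pyral depends on, and the URL in the usage note is a placeholder, not a real Rally endpoint; the function name build_json_request is hypothetical.

```python
import json
import urllib.request


def build_json_request(url, payload, method="POST"):
    """Build (but do not send) a request whose body is explicitly declared as JSON."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method=method,
    )
```

For example, build_json_request("https://example.invalid/defect", {"Defect": {"State": "Closed"}}) yields a request object whose headers can be inspected before anything is sent.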
What you are seeing is probably not related to the Python 2.7.3 / requests 0.13.3 versions being used. The error message you saw has also been reported using the Javascript based App SDK and .NET Toolkit for Rally (2 separate reports here on SO) and at least one other person using Python 2.6.6 and requests 0.9.2. It appears that the error verbiage is being generated on the Rally WSAPI back-end. Current assessment by fellow Rally'ers is that it is an encoding related issue. The question is where the encoding issue originates.
I have yet to be able to repro this issue, having tried with several versions of Python (2.6.x and 2.7.x), several versions of requests and on Linux, MacOS and Win7.
Since you seem comfortable diving into the code and running in debug mode, one avenue to try is to capture the failing POST URL and POST data, attempt the update via a browser-based REST client like 'Simple REST Client' or Poster, and observe whether you get the same error message in the WSAPI response.
I'm seeing similar behavior with pyral while trying to add an attachment to a defect.
With debugging and logging on I see this request on stdout:
2012-07-20T15:11:24.855212 PUT https://rally1.rallydev.com/slm/webservice/1.30/attachmentcontent/create.js?workspace=workspace/123456789
Then the json in the logfile:
2012-07-20 15:11:24.854 PUT attachmentcontent/create.js?workspace=workspace/123456789
{"AttachmentContent": {"Content": "iVBORw0KGgoAAAANSUhEUgAABBQAAAJrCAIAAADf2VflAAAXOWlDQ...
Then this in the logfile (after a bit of fighting with restapi.py to get around the unicode error):
2012-07-20 15:11:25.260 404 Cannot parse input stream due to I/O error as JSON document: Parse error: expected '{' but saw '?' [ chars read = >>>?<<< ]
The notable thing there is the 404 error code. Also, the "Cannot parse input stream..." error message is not coming from pyral, it's coming from Rally's server. So pyral is sending Rally something Rally can't understand.
I also logged the response headers, which may be a clue:
{'rallyrequestid': 'qs-app-03ml3akfhdpjk7c430otjv50ak.qs-app-0387404259', 'content-encoding': 'gzip', 'transfer-encoding': 'chunked', 'expires': 'Fri, 20 Jul 2012 19:18:35 GMT', 'vary': 'Accept-Encoding', 'cache-control': 'no-cache,no-store,max-age=0,must-revalidate', 'date': 'Fri, 20 Jul 2012 19:18:36 GMT', 'p3p': 'CP="NON DSP COR CURa PSAa PSDa OUR NOR BUS PUR COM NAV STA"', 'content-type': 'text/javascript; charset=utf-8'}
Note the 'content-encoding': 'gzip' there. I suspect the requests module (I'm using 0.13.3 with Python 2.6 on macOS) is gzip-encoding its PUT request, but the Rally API server is not properly decoding it.
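One way to test that suspicion is to capture the raw request body and check whether it starts with the gzip magic bytes instead of a JSON brace. The sketch below is a generic debugging aid written for Python 3, not something pyral or requests provides; the function name describe_body is hypothetical.

```python
import json

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of any gzip stream


def describe_body(body):
    """Classify a captured request body as 'gzip', 'json', or 'unknown'."""
    if body[:2] == GZIP_MAGIC:
        return "gzip"
    try:
        json.loads(body.decode("utf-8"))
        return "json"
    except (UnicodeDecodeError, ValueError):
        return "unknown"
```

If the captured body comes back as "gzip", the server is most likely being handed compressed bytes where it expects the opening '{' of a JSON document.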