My team and I are building a speech-to-text application for a specific purpose.
The frontend is in Flutter and the backend is in Django.
I am using the flutter_sound package for Flutter. The only codec it supports for recording audio is Codec.aacADTS, and I am able to save the file as .aac or .adts.
On the backend (Django), we're using the SpeechRecognition library. When we pass it the .aac or .adts file, it raises an error: ValueError("Audio file could not be read as PCM WAV, AIFF/AIFF-C, or Native FLAC; check if file is corrupted or in another format")
We tried a .wav file instead, and speech recognition works.
So, do I need to convert the .aac/.adts file to a .wav file?
How should I do it, and should it be on the frontend or the backend? Any library/code snippets to help me with that?
Frontend Code (Flutter)
Future startRecord() async {
  setState(() {
    isRecording = true;
  });
  Directory? dir = await getExternalStorageDirectory();
  await recorder.startRecorder(
      toFile: "${dir!.path}/audio.aac", codec: Codec.aacADTS);
}
Future stopRecorder() async {
  final filePath = await recorder.stopRecorder();
  final file = File(filePath!);
  uploadAudio(file);
  print('Recorded file path: $filePath');
  setState(() {
    isRecording = false;
    audioFile = file;
  });
}
Backend Code (Django)
@api_view(['GET'])
def speech_to_text(request):
    conn = mongo_conn()
    audio_file_ID = request.GET['AudioFileID']
    audio_file = download_audio(conn, audio_file_ID)
    r = sr.Recognizer()
    # sr.AudioFile (not sr.AudioData) is the context manager that reads an audio file
    with sr.AudioFile(audio_file) as source:
        # listen for the data (load audio to memory)
        audio_data = r.record(source)
    # recognize (convert from speech to text)
    text = r.recognize_google(audio_data)
    print(text)
    return JsonResponse({'text': text}, safe=False)
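One way to handle the conversion on the backend is to shell out to ffmpeg before handing the file to SpeechRecognition. The following is a minimal sketch, not a definitive implementation. It assumes the ffmpeg binary is installed on the server and available on PATH. The helper names build_ffmpeg_command and aac_to_wav are hypothetical, and the 16 kHz mono output is an illustrative choice, not something SpeechRecognition requires.

```python
import shutil
import subprocess
from pathlib import Path
from typing import List


def build_ffmpeg_command(src: Path, dst: Path) -> List[str]:
    """Return the ffmpeg arguments that convert src into 16-bit PCM WAV."""
    return [
        "ffmpeg",
        "-y",                 # overwrite dst if it already exists
        "-i", str(src),       # input file (the .aac/.adts upload)
        "-ac", "1",           # downmix to mono (illustrative choice)
        "-ar", "16000",       # resample to 16 kHz (illustrative choice)
        "-c:a", "pcm_s16le",  # 16-bit little-endian PCM, i.e. plain WAV
        str(dst),
    ]


def aac_to_wav(src: Path) -> Path:
    """Convert src to a .wav file next to it and return the new path."""
    if shutil.which("ffmpeg") is None:
        raise RuntimeError("ffmpeg was not found on PATH")
    dst = src.with_suffix(".wav")
    # check=True raises CalledProcessError if ffmpeg exits with an error
    subprocess.run(build_ffmpeg_command(src, dst), check=True, capture_output=True)
    return dst
```

In the view above, the downloaded upload would be written to disk, passed through aac_to_wav, and the resulting path given to sr.AudioFile. Converting on the backend keeps the Flutter client unchanged, at the cost of an extra dependency on the server.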
I'm using Google's Speech-to-Text API in Node.js. It transcribes the first few words, but then ignores the remainder of the audio file. The cut-off point is around 5-7 seconds into any uploaded file.
I've tried synchronous speech recognition for shorter audio files.
(Example using an MP3 file shown below)
filename = './TEST/test.mp3';
const client = new speech.SpeechClient();
//configure the request:
const config = {
    enableWordTimeOffsets: true,
    sampleRateHertz: 44100,
    encoding: 'MP3',
    languageCode: 'en-US',
};
const audio = {
    content: fs.readFileSync(filename).toString('base64'),
};
const request = {
    config: config,
    audio: audio,
};
// Detects speech in the audio file
const [response] = await client.recognize(request);
I've also tried asynchronous recognition for longer audio files.
(Example using a WAV file shown below)
filename = './TEST/test.wav';
const client = new speech.SpeechClient();
//configure the request:
const config = {
    enableWordTimeOffsets: true,
    languageCode: 'en-US',
};
const audio = {
    content: fs.readFileSync(filename).toString('base64'),
};
const request = {
    config: config,
    audio: audio,
};
//Do a longRunningRecognize request
const [operation] = await client.longRunningRecognize(request);
const [response] = await operation.promise();
I've tried each of these implementations with both WAV files and MP3. The result is always exactly the same: good recognition for the first 5 seconds, then nothing at all.
Any help would be greatly appreciated!
@Ricco D was absolutely right, I was printing the results incorrectly...
When you try to transcribe longer files, Google Speech to Text will break up your transcription based on when it detects pauses in speech.
Your response.results[] array will have multiple entries that you need to loop through to print the full transcript.
See the docs for more detail:
https://cloud.google.com/speech-to-text/docs/basics#responses
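The loop described above can be sketched as follows. This is written in Python rather than Node.js and only illustrates the shape of the response: each entry in results carries a list of alternatives, and the first alternative holds the transcript for that segment. The function name full_transcript is hypothetical, and the SimpleNamespace objects are stand-ins for a real API response.

```python
from types import SimpleNamespace


def full_transcript(response) -> str:
    """Join the top alternative of every result into one transcript."""
    parts = []
    for result in response.results:
        # A result can come back without alternatives; skip it if so
        if result.alternatives:
            parts.append(result.alternatives[0].transcript.strip())
    return " ".join(parts)


# Stand-in for a response that was split into two segments at a pause
fake_response = SimpleNamespace(results=[
    SimpleNamespace(alternatives=[SimpleNamespace(transcript="hello there")]),
    SimpleNamespace(alternatives=[SimpleNamespace(transcript=" how are you")]),
])
print(full_transcript(fake_response))
```

Printing only response.results[0] gives just the first segment, which matches the symptom of the transcript stopping after a few seconds.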
I'm trying to download a .gz file from a Django server (Python 3.7) using an Ajax POST request. Below are the minimal Django view, which compresses a folder and sends it, and the Ajax function, which requests the download and receives the data on the client:
import tarfile
from pathlib import Path

from django.http import HttpResponse

def downloadfile(request):
    folder = Path().home().joinpath('workspace')
    tar_path = Path().home().joinpath('workspace.gz')
    # Compress the folder into a gzipped tar archive
    tar = tarfile.open(tar_path.as_posix(), 'w:gz')
    tar.add(folder.as_posix(), arcname='workspace')
    tar.close()
    try:
        # Read the archive back as raw bytes and return it as an attachment
        with open(tar_path.as_posix(), 'rb') as f:
            file_data = f.read()
        response = HttpResponse(file_data, content_type='application/gzip')
        response['Content-Disposition'] = 'attachment; filename="workspace.gz"'
    except IOError:
        response = HttpResponse('File not exist')
    return response
This is the Ajax function on the client side:
$(function () {
    $('#downloadfile').submit(function () {
        $.ajax({
            type: 'POST',
            url: 'downloadfile',
            success: function(response){
                download(response,'workspace.gz', 'application/gzip');
            }
        });
        return false;
    });
});
function download(content, filename, contentType)
{
    var a = document.createElement('a');
    var blob = new Blob([content], {'type':contentType});
    a.href = window.URL.createObjectURL(blob);
    a.download = filename;
    a.click();
}
A sample gzipped folder that is 36.5 KB will be inflated to 66.1 KB when downloaded and it clearly can't be extracted.
What I know:
The file is healthy and extractable on server side.
The data is transferred and downloaded on the client but inflated and broken.
The response variable in the JavaScript function looks like binary data (no header whatsoever).
What I don't know:
Why is the data size increased even though I'm reading and sending the compressed file as binary and both content types are set to 'application/gzip'?
If something is added to the file, what is it and when is it being added?
Thank you,
After spending a few hours on this, the following worked for me.
The trick was to use binascii.hexlify in the Django view. The sending part of the view should be:
try:
    with open(tar_path.as_posix(), 'rb') as f:
        file_data = binascii.hexlify(f.read())
    response = HttpResponse(str(file_data), content_type='application/gzip')
    response['Content-Disposition'] = 'attachment; filename="%s"' % userid
    os.remove(tar_path.as_posix())
except IOError:
    response = HttpResponse('File not exist')
return response
and the JS part should create a bytearray:
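// str() of a Python bytes object yields "b'...'", so strip the leading b' and the trailing '
var r = response.substring(2, response.length - 1);
// Turn each pair of hex digits back into one byte
var typedArray = new Uint8Array(r.match(/[\da-f]{2}/gi).map(function (h) {
    return parseInt(h, 16)
}));
download(typedArray, 'workspace.tar.gz', 'application/gzip');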
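A plausible explanation for the inflated download, which is not stated in the thread itself, is that the Ajax call hands the response body to the success callback as text, so the gzip bytes are decoded as UTF-8 and every invalid byte sequence is replaced. The sketch below simulates that on the Python side under this assumption and shows why the hex round trip avoids it. The sample payload is made up for illustration.

```python
import binascii
import gzip

original = b"example workspace contents " * 200
payload = gzip.compress(original)

# Simulate a client that decodes the binary body as UTF-8 text:
# invalid byte sequences become U+FFFD, which takes 3 bytes when re-encoded
as_text = payload.decode("utf-8", errors="replace")
round_tripped = as_text.encode("utf-8")
print(len(payload), len(round_tripped))

# The hex workaround: only ASCII characters travel over the wire
hex_text = str(binascii.hexlify(payload))  # looks like "b'1f8b08...'"
recovered = bytes.fromhex(hex_text[2:-1])  # strip the b' prefix and ' suffix
print(gzip.decompress(recovered) == original)
```

The hex form doubles the transfer size, so it trades bandwidth for a body that survives text decoding unchanged.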
I am trying to unit test my file uploading REST API. I found online some code generating the image with Pillow but it can't be serialized.
This is my code for generating the image :
image = Image.new('RGBA', size=(50, 50), color=(155, 0, 0))
file = BytesIO(image.tobytes())
file.name = 'test.png'
file.seek(0)
Then I try to upload this image file:
return self.client.post("/api/images/", data=json.dumps({
    "image": file,
    "item": 1
}), content_type="application/json", format='multipart')
And I get the following error:
<ContentFile: Raw content> is not JSON serializable
How can I transform the Pillow image so it's serializable?
I wouldn't recommend submitting your data as JSON in this case, as it complicates the issue. Just make a POST request with the parameters and files you want to submit. Django REST Framework will handle it just fine without you needing to serialise it as JSON.
I wrote a test for uploading a file to an API endpoint a while back which looked like this:
def test_post_photo(self):
    """
    Test trying to add a photo
    """
    # Create an album
    album = AlbumFactory(owner=self.user)
    # Log user in
    self.client.login(username=self.user.username, password='password')
    # Create image
    image = Image.new('RGB', (100, 100))
    tmp_file = tempfile.NamedTemporaryFile(suffix='.jpg')
    image.save(tmp_file)
    tmp_file.flush()  # make sure the image bytes are on disk before reopening
    # Send data
    with open(tmp_file.name, 'rb') as data:
        response = self.client.post(reverse('photo-list'), {'album': 'http://testserver/api/albums/' + str(album.pk), 'image': data}, format='multipart')
    self.assertEqual(response.status_code, status.HTTP_201_CREATED)
In this case, I used the tempfile module to store an image generated using Pillow. The with syntax used in the example allows you to pass the content of the file in the request body comparatively easily.
Based on this, something like this should work for your use case:
image = Image.new('RGBA', size=(50, 50), color=(155, 0, 0))
file = tempfile.NamedTemporaryFile(suffix='.png')
image.save(file)
file.flush()  # make sure the image bytes are on disk before reopening
with open(file.name, 'rb') as data:
    return self.client.post("/api/images/", {"image": data, "item": 1}, format='multipart')
Incidentally, depending on your use case it might be more convenient to accept the image data as a base 64 encoded string.
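If you do go the base64 route, the idea is to turn the raw bytes into ASCII text before they reach the JSON encoder and to reverse that on the server. A minimal sketch, assuming the API accepts "image" and "item" fields as in the question; the function names image_to_json_payload and image_from_json_payload are hypothetical:

```python
import base64
import json


def image_to_json_payload(raw: bytes, item: int) -> str:
    """Build a JSON body whose "image" field is base64 text, not raw bytes."""
    return json.dumps({
        "image": base64.b64encode(raw).decode("ascii"),
        "item": item,
    })


def image_from_json_payload(body: str) -> bytes:
    """Recover the original bytes from a body built by image_to_json_payload."""
    return base64.b64decode(json.loads(body)["image"])


# Arbitrary binary data, including bytes that are not valid UTF-8
raw = bytes(range(256))
body = image_to_json_payload(raw, item=1)
print(image_from_json_payload(body) == raw)
```

Base64 grows the payload by roughly a third, which is one reason a multipart upload is usually the simpler choice for files.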
You converted the file to bytes, which is not JSON serializable.
Without knowing what your API expects to receive, I'll have to take a guess that you have to encode file as a string: "image": file.decode('utf-8').
While there are many solutions to your general issue of unit testing image uploads to a REST API
I followed the steps in media/upload and wrote this function in Python:
def upload_media(self,access_token,image_url):
    client = self.get_client(access_token)
    message = {'media' : image_url}
    encoded_status = urllib.urlencode(message)
    url = "https://upload.twitter.com/1.1/media/upload.json?"+ encoded_status
    resp, content = client.request(url,'post')
    return content
And I got this :
{"request":"\/1.1\/media\/upload.json","error":"media type unrecognized."}
As far as I can tell, the error is in trying to upload a URL. The Twitter API requires you to upload a base64-encoded image.
See: https://dev.twitter.com/rest/reference/post/media/upload
So instead of the image's URL, it should be the file content:
with open('example.jpg', 'rb') as f:
    data = f.read()
message = {'media':data}
Optionally (I still haven't figured out whether this is required or not, as different people give different answers), you could encode the image in base-64 encoding:
with open('example.jpg', 'rb') as f:
    data = f.read()
# The 'base64' codec on str works on Python 2 only; on Python 3 use base64.b64encode(data)
data = data.encode('base64')
message = {'media':data}
I'm trying to upload a file to my web service (written using Django REST framework). I have written the following code, but I get a "data cannot be converted to utf-8" error:
with open('/images/img.jpg', 'rb') as imgFile:
    content = imgFile.read ()
json = { 'fileName': 'img.jpg', 'img': content}
json_data = simplejson.dumps(json)
reqURL = urllib2.Request("http://localhost:8000/uploadfile/",json_data)
opener = urllib2.build_opener()
f = opener.open(reqURL)
What is the right way of passing file content over JSON?
You don't send files like this. File contents are sent by embedding them inside the request body.
You may be better off using the python-requests library. Check out its file upload section.