Check that textInput does not exceed 256 (dialogflow) - python-2.7

I send a query to Dialogflow from the Python API and I get the error:
Text length must not exceed 256 bytes.
I calculate the length of my text like this:
l = len(Text)
But I still get the error that my text exceeds 256 bytes.
So I want to know how to check that my text doesn't exceed 256 bytes.

The limit is counted in bytes, not characters, and a non-ASCII character takes more than one byte in UTF-8. So len(Text.encode('utf-8')) should work better than just len(Text).
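For example, a quick check that works on both Python 2.7 and 3 (the helper name and the hard-coded limit are illustrative):

    MAX_BYTES = 256  # Dialogflow's documented limit for text input

    def fits_limit(text):
        # The API counts UTF-8 bytes, not characters.
        return len(text.encode('utf-8')) <= MAX_BYTES

    text = u"h\u00e9llo"              # 5 characters, but 6 bytes in UTF-8
    print(len(text))                  # 5
    print(len(text.encode('utf-8')))  # 6 -- this is what the API checks
    print(fits_limit(text))           # True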
Good luck!

Related

TextSizeLimitExceededException when calling the DetectPiiEntities operation

I am using AWS Comprehend for PII redaction. The idea is to detect entities and then redact the PII from the text.
The problem is that this API has an input text size limit. How can I increase the limit, maybe to 1 MB? Or is there any other way to detect entities in large text?
ERROR: botocore.errorfactory.TextSizeLimitExceededException: An error occurred (TextSizeLimitExceededException) when calling the DetectPiiEntities operation: Input text size exceeds limit. Max length of request text allowed is 5000 bytes while in this request the text size is 7776 bytes
There's no way to increase this limit.
For input text greater than 5000 bytes, you can split the text into multiple chunks of up to 5000 bytes each and then aggregate the results.
Do make sure you keep some overlap between chunks, to carry over context from the previous chunk.
For reference, you can use the similar solution published by the Comprehend team itself: https://github.com/aws-samples/amazon-comprehend-s3-object-lambda-functions/blob/main/src/processors.py#L172
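A minimal sketch of that chunk-and-aggregate approach with boto3 (the chunk size, overlap, and helper name are illustrative; a production version should also deduplicate entities detected twice in the overlapping regions):

    import boto3

    comprehend = boto3.client('comprehend')

    MAX_BYTES = 5000    # hard limit per DetectPiiEntities request
    CHUNK_CHARS = 4000  # conservative chunk size, in characters
    OVERLAP = 200       # characters of overlap to carry context between chunks

    def detect_pii_large(text, language_code='en'):
        entities = []
        start = 0
        while start < len(text):
            chunk = text[start:start + CHUNK_CHARS]
            # Shrink the chunk if its UTF-8 encoding still exceeds the byte limit.
            while len(chunk.encode('utf-8')) > MAX_BYTES:
                chunk = chunk[:-100]
            resp = comprehend.detect_pii_entities(Text=chunk,
                                                  LanguageCode=language_code)
            for entity in resp['Entities']:
                # Shift the offsets back into the coordinates of the full text.
                entity['BeginOffset'] += start
                entity['EndOffset'] += start
                entities.append(entity)
            if start + len(chunk) >= len(text):
                break
            start += len(chunk) - OVERLAP
        return entities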

How to increase the size of the input in lucee

What is the reason behind the following issue, and how can I increase the size of the input in Lucee? I don't know what to do. Please help me.
I know how to increase it in ColdFusion: there you use Server Settings > Request Size Limits.
lucee.runtime.exp.NativeException: The input was too large. The specified input was 15,307 bytes and the maximum is 15,000 bytes.

Is it possible to determine the max length of a field in a csv file using regex?

This has been discussed on Stack Overflow before, but I couldn't find a case/answer that applies to my situation:
From time to time I have raw text data to be imported into SQL. In almost every case I must try several times, because the SSIS wizard doesn't know the max size of each field and the default is 50 characters. Only after it fails can I tell from the error message which (first) field was truncated, and I then increase that field's size.
There may be more than one field that needs its size increased, and the SSIS wizard reports only one error each time it encounters a truncation, so as you can see this is very tedious. I want a way to quickly inspect the data first to determine the max size of each field.
I came across an old post on stackoverflow: Here is the post
Unfortunately it might not work in my case: my raw data could have as many as 10 million rows (yes, in one single text file that is over a GB).
I don't really think there is a way to do this, but I still want to post my question here hoping to get some clue.
Thank you very much.
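Rather than regex, one quick pre-scan option is a single streaming pass in Python with the csv module; it reads one row at a time, so a multi-gigabyte file with 10 million rows is fine. A minimal sketch, assuming a comma-delimited file with a header row (the file name and dialect options are illustrative):

    import csv

    max_lengths = {}  # column name -> longest value seen so far

    with open('rawdata.txt', newline='', encoding='utf-8') as f:
        reader = csv.reader(f)  # adjust delimiter=/quotechar= to match the file
        header = next(reader)   # assumes the first row holds the column names
        for row in reader:
            for name, value in zip(header, row):
                if len(value) > max_lengths.get(name, 0):
                    max_lengths[name] = len(value)

    for name in header:
        print(name, max_lengths.get(name, 0))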

Coldfusion 8 - Problems with indexing large data using verity

I am currently running ColdFusion 8 with Verity running on a K2 server. I am using a query to index several different columns of my table with cfindex. One of the columns is a large varchar type.
It seems that when the data is indexed, only the first 30KB is stored, so no results come back if I search for anything after that point. I tried moving several different phrases and words further up in the data, within the first 30KB, and the results then appeared.
I then carried out more Verity tests using the browse command at the command prompt to see what's actually in the collection, i.e.
Coldfusion8\verity\collections\\parts browse 0000001.ddd
I found that the body being indexed (CF_BODY) never exceeds a size of 32,000.
Can anyone tell me if there is a fixed index size per document for Verity?
Many thanks,
Richard
Punch line
Version 6 has operator limits:
up to 32,764 children in one "topic" for the ANY operator
up to 64 children for NEAR
Exceeding these values doesn't necessarily produce an error message. Are you certain your searches don't exceed them?
Source
The Verity documentation, Appendix B: Query limits, says there are two kinds of limitations: search-time limits and operator limits. The quote below is the whole section on the latter, straight from the book.
Verity Query Language and Topic Guide, Version 6.0:
Note the following limits on the use of operators:
There can be a maximum of 32,764 children for the ANY operator. If a topic exceeds this limit, the search engine does not always return an error message.
The NEAR operator can evaluate only 64 children. If a topic exceeds this limit, the search engine does not return an error message.
For example, assume you have created a large topic that uses the ACCRUE operator with 8365 children. This topic exceeds the 1024 limit for any ACCRUE-class topic and the 16000/3 limit for the total number of nodes.
In this case, you cannot substitute ANY for ACCRUE, because that would cause the topic to exceed the 8,000 limit for the maximum number of children for the ANY operator. Instead, you can build a deeper tree structure by grouping topics and creating some named subnodes.

Google Charts API: Using a fixed set of datapoints

Is there a way to set the length of a data series using Google Charts, i.e. send in 40 values, stipulate that the range is 256 values, and have it plot the 40 values while leaving room for the (256-40) remaining values in the chart?
To get the idea, think of a finance intraday chart: at 10 o'clock it displays only the data received by that time, but the chart still shows all of the space that eventually WILL be filled (when the trading day is over, that is).
For a live preview of the effect to be accomplished here, see finance.google.com and look at the chart before 4 o'clock in the afternoon: you'll see that it is not completely filled, although the chart is always the same "size" in terms of data range.
Fill the rest of the values with the _ (or __, depending on your encoding) special value, which indicates "no data".
See the documentation on simple encoding for more information. Text encoding uses negative values (e.g. -1) to indicate missing data.
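For example, with the (now retired) Image Charts URL API and simple encoding, you can pad the data string out to the full width. A minimal sketch in Python (the chart size is illustrative, and the values are assumed to be pre-scaled into simple encoding's 0-61 range):

    # Simple encoding: 'A'-'Z', 'a'-'z', '0'-'9' map to 0-61; '_' means "no data".
    SIMPLE = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789'

    def encode_simple(values, total_points):
        encoded = ''.join(SIMPLE[v] for v in values)
        # Pad the series to the full width so the chart leaves room for future points.
        return encoded + '_' * (total_points - len(values))

    data = encode_simple([10, 14, 19, 23, 30], 40)  # 5 of 40 points filled so far
    url = 'https://chart.googleapis.com/chart?cht=lc&chs=300x125&chd=s:' + data
    print(url)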