Univocity parser using routines ignores the LongConversion using defaultNullRead attribute? - univocity

I have the following field configuration:
@Parsed(field="TEST_ID", defaultNullRead="000000")
private Long testId;
When the input file (CSV parsing) contains the value NULL, it is not converted to the default long value of 0; instead, a LongConversion exception is thrown for "NULL".
For example, this row in the CSV file (the 5th column, containing NULL, is the problem):
7777|ab|444|PENDING|NULL|VESRION|TEST|11
I am using CsvRoutines to parse the input CSV file.

NULL in your input is actually text and not Java's null. You need to tell the parser to translate the string NULL to java null.
Add the following annotation (you can give it more than one string that represents null):
@NullString(nulls = {"NULL", "N/A", "?"})
Hope this helps
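To make the order of operations in the answer concrete, here is a minimal plain-Java sketch of the idea: first map the null marker strings to null, then substitute the default, then convert to Long. It does not use univocity and is not the library's implementation; the class NullStringSketch and the helper readLong are hypothetical names made up for illustration.

```java
import java.util.Set;

final class NullStringSketch {

    // Strings treated as "no value" (same list as in the answer's annotation).
    static final Set<String> NULL_STRINGS = Set.of("NULL", "N/A", "?");

    // Hypothetical helper: null markers become null, null becomes the default,
    // and only then is the text converted to a Long.
    static Long readLong(String raw, String defaultNullRead) {
        String value = (raw == null || NULL_STRINGS.contains(raw)) ? null : raw;
        if (value == null) {
            value = defaultNullRead;
        }
        return value == null ? null : Long.valueOf(value);
    }

    public static void main(String[] args) {
        String[] row = "7777|ab|444|PENDING|NULL|VESRION|TEST|11".split("\\|");
        System.out.println(readLong(row[4], "000000")); // the NULL column falls back to the default
        System.out.println(readLong(row[0], "000000")); // a regular numeric column
    }
}
```

Without the first step, the literal text "NULL" reaches the numeric conversion directly, which is the failure described in the question.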


How to convert text field with formatted currency to numeric field type in Postgres?

I have a table that has a text field which has formatted strings that represent money.
For example, it has values like these, along with "bad" invalid data:
$5.55
$100050.44
over 10,000
$550
my money
570.00
I want to convert this to a numeric field, keeping the actual numbers wherever they can be recovered and converting everything else to null.
Originally I was using the function below, which did convert clean numbers (numbers without any formatting). The issue was that it would not convert $5.55, for example, and set it to null instead.
CREATE OR REPLACE FUNCTION public.cast_text_to_numeric(
    v_input text)
    RETURNS numeric
    LANGUAGE 'plpgsql'
    COST 100
    VOLATILE
AS $BODY$
declare
    v_output numeric default null;
begin
    begin
        v_output := v_input::numeric;
    exception when others then
        return null;
    end;
    return v_output;
end;
$BODY$;
I then created a simple update statement that removes every character that is not a word character or a period (so $, commas, and spaces are stripped):
update public.numbertesting set field_1=regexp_replace(field_1,'[^\w.]','','g')
and if I run this statement, it correctly converts the text data to numeric and maintains the number:
alter table public.numbertesting
alter column field_1 type numeric
using field_1::numeric
But I need to use the function in order to properly discard any bad data and set those values to null.
Even after I run the cleanup so that the text value is, say, 5.55, my "cast_text_to_numeric" function STILL sets it to null. I don't understand why the function returns null when the statement above correctly converts the same value to a proper number.
How can I fix my cast_text_to_numeric function to properly convert values such as 5.55?
I'm OK with discarding (setting to NULL) any values that don't end up as just digits and a period. The regular expression strips out all other characters, and if there happen to be two numbers in the text field, the script combines them into one (spaces are removed); I'm fine with that.
In the example of data above, after conversion, the end result in numeric field would be:
5.55
100050.44
null
550
null
570.00
FYI, I am on Postgres 11 right now

How to read first few rows from CSV using univocity parser

How can I stop parsing after reading a few rows from a CSV file, using an iterator or row processor in the univocity parser?
Update #1
I tried the below code and I'm getting empty rows.
val parserSettings = new CsvParserSettings
parserSettings.detectFormatAutomatically()
parserSettings.setEmptyValue("")
parserSettings.setNumberOfRecordsToRead(numberOfRecordsToRead)
val parser = new CsvParser(parserSettings)
val input = new FileInputStream(path)
val rows = parser.parseAll(input)
Update #2
Before passing the input stream to the parser, I was using Apache Tika to detect the MIME type of the file, to check whether the file is a CSV.
new Tika().detect(input)
This was altering the input stream, so the univocity parser was unable to parse it correctly.
You have many different options:
From your row processor just call context.stop().
On the parser settings, you can set settings.setNumberOfRecordsToRead(10) to read 10 rows and stop.
With the parser itself, call parser.stopParsing()
Hope this helps
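On the stream problem from Update #2: one common way to inspect the start of a stream without losing it for the next reader is mark/reset on a BufferedInputStream. The sketch below uses only the Java standard library; the class StreamPeekSketch and the method peek are hypothetical stand-ins for a content sniffer, and whether a given detection library rewinds the stream itself is something to check in its own documentation.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

final class StreamPeekSketch {

    // Hypothetical stand-in for a content sniffer: reads up to `limit` bytes, then rewinds.
    static String peek(InputStream in, int limit) {
        if (!in.markSupported()) {
            throw new IllegalArgumentException("stream must support mark/reset");
        }
        try {
            in.mark(limit);                      // remember the current position
            byte[] head = in.readNBytes(limit);  // consume the first bytes
            in.reset();                          // rewind to the marked position
            return new String(head, StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Reads whatever is left in the stream as UTF-8 text.
    static String readAll(InputStream in) {
        try {
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] csv = "id,name\n1,a\n2,b\n".getBytes(StandardCharsets.UTF_8);
        // With a real file this would wrap new FileInputStream(path),
        // which does not support mark/reset on its own.
        InputStream input = new BufferedInputStream(new ByteArrayInputStream(csv));
        System.out.println(peek(input, 7)); // inspect the first bytes
        System.out.print(readAll(input));   // the full content is still available
    }
}
```

Opening a second, fresh stream for the parser is the simpler alternative when the file can be read twice.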

MongoDB C++ String encoding error on accent when inserting JSON string

I have a problem when I insert a JSON string into MongoDB from a C++ function. I am basically building a big std::string formatted as JSON and putting my values in it.
Some of the strings I put in the JSON contain accented characters, and I get an error when I try to view the document in the DB afterwards.
This is my update/insert code:
mongodb_client_connector.update(
    mongodb_database + "." + MONGODB_COLLECTION,
    Query(BSON(MONGODB_ID << OID(param_oid))),
    fromjson(The_JSON_I_Wrote)
);
This is the result:
How do I format the string correctly so I get the accents?

AutoHotkey RegExReplace with math

I am trying to change all instances of a number in an xml file. The constant 45 should be added to the number.
Temp is the following text:
<rownum value="1">
<backupapplication>HP Data Protector</backupapplication>
<policy>AUTDR12_Daily</policy>
<policytype>FileSystem</policytype>
<dataretained>31</dataretained>
<fullbackup>7</fullbackup>
<backuptime>0.17</backuptime>
<retentionperiod>Short</retentionperiod>
<peakmbps>11</peakmbps>
<backupcategory>Fulls & Fulls</backupcategory>
</rownum>
<rownum value="2">
<backupapplication>HP Data Protector</backupapplication>
<policy>AUTP_Appl_Monthly</policy>
<policytype>FileSystem</policytype>
<dataretained>268</dataretained>
<fullbackup>91</fullbackup>
<backuptime>2.31</backuptime>
<retentionperiod>Long</retentionperiod>
<peakmbps>12</peakmbps>
<backupcategory>Fulls & Fulls</backupcategory>
</rownum>
I tried the following code:
NeedleRegEx = <rownum value="(\d+)">
Replacement = <rownum value="($1+45)">
Temp := RegExReplace(Temp, NeedleRegEx, Replacement)
But this changes it into
<rownum value="1+45">
while I want
<rownum value="46">
How do I do this in AutoHotKey?
Regular expressions aren't designed to evaluate mathematical expressions. In some languages (e.g. JavaScript) you can pass a replacement function that computes each replacement dynamically, but no such luck in AHK.
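For comparison, this is roughly what such a replacement function looks like in a language that has one. The sketch uses Java (9 or later), where Matcher.replaceAll accepts a function of the match; the class name RegexMathSketch and the method addToRownum are made up for illustration, and this is not something AHK offers.

```java
import java.util.regex.Pattern;

final class RegexMathSketch {

    // Same pattern as the NeedleRegEx in the question.
    private static final Pattern ROWNUM = Pattern.compile("<rownum value=\"(\\d+)\">");

    // Rebuilds every opening rownum tag with offset added to the captured number.
    static String addToRownum(String text, int offset) {
        return ROWNUM.matcher(text).replaceAll(
                m -> "<rownum value=\"" + (Integer.parseInt(m.group(1)) + offset) + "\">");
    }

    public static void main(String[] args) {
        String temp = "<rownum value=\"1\">\n</rownum>\n<rownum value=\"2\">\n</rownum>";
        System.out.println(addToRownum(temp, 45));
    }
}
```

The arithmetic happens in ordinary code inside the callback; the regular expression only locates the number.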
Using RegEx for the purpose of parsing XML documents isn't good practice anyway. I suggest using an XML parser instead. For AHK, you can utilize a COM object of MSXML2.DOMDocument. Here's an example (and further references) of how to use it: http://www.autohotkey.com/board/topic/56987-com-object-reference-autohotkey-v11/page-2#entry367838.
What you want to do is parse your XML to a DOM document and loop over every rownum tag. Now, you can retrieve the value attribute, increment it, and overwrite the attribute with the new value.
Update
Regarding the code you've posted in the comments: there were some minor mistakes and one big mistake. The big mistake was trying to parse non-valid XML. You can check your XML files by feeding them to a formatter/validator. The loadXml() method will return false if there was a parsing error. The method obj.saveXML() does not exist. If you want to retrieve the document's string representation, simply access its xml property: obj.xml. If you want to save it to a file, there's the built-in method save(filepath).
Here's my suggestion for a clean approach (yes, you CAN use meaningful variable names!):
doc := ComObjCreate("MSXML2.DOMDocument.6.0")
if(!doc.loadXml(xmlString)) {
    msgbox % "Hey! That's no valid XML!"
    ExitApp
}
rownums := doc.getElementsByTagName("rownum")
Loop % rownums.length
{
    rownum := rownums.item(A_Index-1)
    value := rownum.getAttribute("value")
    value += 45
    rownum.setAttribute("value", value)
}
doc.save("myNewFile.xml")
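The same parse, loop, and update approach can be sketched outside AHK as well. Below is a rough Java equivalent using the JDK's built-in DOM parser; the class RownumDomSketch and the wrapping rows root element in the sample are assumptions added for illustration, since a well-formed document needs a single root and an escaped ampersand.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class RownumDomSketch {

    // Parses the XML, adds offset to every rownum value attribute, and serializes the result.
    static String addToRownum(String xml, int offset) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            NodeList rownums = doc.getElementsByTagName("rownum");
            for (int i = 0; i < rownums.getLength(); i++) {
                Element rownum = (Element) rownums.item(i);
                int value = Integer.parseInt(rownum.getAttribute("value"));
                rownum.setAttribute("value", String.valueOf(value + offset));
            }
            StringWriter out = new StringWriter();
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            transformer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            // Parsing fails here for input that is not well-formed XML.
            throw new IllegalStateException("could not parse or serialize the XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<rows>"
                + "<rownum value=\"1\"><backupcategory>Fulls &amp; Fulls</backupcategory></rownum>"
                + "<rownum value=\"2\"><backupcategory>Fulls &amp; Fulls</backupcategory></rownum>"
                + "</rows>";
        System.out.println(addToRownum(xml, 45));
    }
}
```

As with loadXml() in the AHK version, the parse step is where invalid input surfaces, here as an exception rather than a false return value.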

Use a String as an E4X Expression in AS3?

I need to use a string to access nodes and attributes in XML using E4X. It would be ideal to have this scenario (with XML already loaded):
var myXML:XML = e.target.data;
var myStr:String = "appContent.bodyText.(@name == 'My Text')";
myXML.myStr = "New Value for bodyText node where attribute('name') is equal to 'My Text'";
I ultimately need to set new values to an XML document using strings as E4X expressions.
As noted above:
I figured out a workaround
Take the string of the E4X path you want to target
Pull the E4X path and compare it to your target path
If the two are equal, do what you will with that node/attribute
It's a hack, but it works. You could even parse the XML and populate an array with the target string and the target node, then you could just access it through an item in the array. This is expandable in many ways. As long as everything is set up for proper garbage collection, you'll be okay.
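The lookup-table variant of this workaround can be sketched as follows. This is Java with the JDK's DOM API rather than AS3/E4X, and the path format ("appContent.bodyText[My Text]"), the class PathIndexSketch, and its methods are all invented for illustration: the point is only that each node is stored under a string key, so a string can later select the node to update.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

final class PathIndexSketch {

    // Parses an XML string and returns its root element.
    static Element parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalArgumentException("not well-formed XML", e);
        }
    }

    // Walks the tree below root and stores every element under a path string.
    static Map<String, Element> index(Element root) {
        Map<String, Element> byPath = new LinkedHashMap<>();
        for (Node n = root.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n instanceof Element) collect((Element) n, "", byPath);
        }
        return byPath;
    }

    private static void collect(Element el, String parentPath, Map<String, Element> byPath) {
        String step = el.getTagName();
        if (el.hasAttribute("name")) {
            step += "[" + el.getAttribute("name") + "]"; // made-up key format
        }
        String path = parentPath.isEmpty() ? step : parentPath + "." + step;
        byPath.putIfAbsent(path, el); // first match wins if two nodes share a path
        for (Node n = el.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n instanceof Element) collect((Element) n, path, byPath);
        }
    }

    public static void main(String[] args) {
        Element root = parse("<data><appContent>"
                + "<bodyText name=\"My Text\">old</bodyText>"
                + "<bodyText name=\"Other\">keep</bodyText>"
                + "</appContent></data>");
        Map<String, Element> byPath = index(root);
        // The target is chosen by a plain string, as in the question.
        byPath.get("appContent.bodyText[My Text]").setTextContent("New Value");
        System.out.println(byPath.get("appContent.bodyText[My Text]").getTextContent());
    }
}
```

Because the map holds references to the live nodes, it should be cleared or rebuilt whenever the document is replaced, which is the garbage-collection concern mentioned above.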