How to get page break character in pycharm


Extracting text with pdftotext is quicker if you write everything to one file instead of one file per page. A special character is used as the page break, but PyCharm does not register it when I search for it. In Sublime Text it shows up as <0x0c>. I cannot copy it here either. Is there a way to replace these characters with another, readable character? Or to read them somehow, so that I can detect when a page has changed?
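For reference, 0x0c is the ASCII form feed character, which Python writes as "\f" (or "\x0c"). A minimal sketch, assuming the extracted text has already been read into a string; the function names and the marker text are made up for illustration:

```python
FORM_FEED = "\f"  # same character as "\x0c", shown as <0x0c> in Sublime Text


def split_pages(text):
    """Split extracted text into pages at each form feed character."""
    pages = text.split(FORM_FEED)
    # A form feed after the last page leaves an empty trailing entry; drop it.
    if pages and pages[-1] == "":
        pages.pop()
    return pages


def mark_page_breaks(text, marker="\n=== PAGE BREAK ===\n"):
    """Replace every form feed with a readable marker (hypothetical marker text)."""
    return text.replace(FORM_FEED, marker)


sample = "first page\x0csecond page\x0c"
pages = split_pages(sample)
```

One thing to watch for: str.splitlines() treats "\f" as a line boundary, so it silently swallows the page breaks, whereas text.split("\n") leaves them in place.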

Related

Search/replace in block selection in Notepad++

Is there a way to limit search/replace only to a columnar block selection in Notepad++?
Here is what I am trying to do:
I am bulk-editing metadata extracted from large numbers of photos.
The metadata comes to me as a csv file with no quotes around the fields in the header line and no quotes around the first field in each succeeding line.
I edit this file in OpenOffice Calc, which exports with quotes around all fields.
I can easily edit the header row, but the problem comes in stripping the quotes from only the first field in the succeeding lines.
I can use Notepad++ in columnar mode but, after selecting the first column, the 'search only in selection' option box is greyed out.
I can do this by hand, but that means lots of hand-work and an increased chance of error.

I know, this probably won't help you any more, but I just had the same problem and stumbled across this question.
I found moving the block in question to a new file and performing the find/replace there works quite decently. When moving the block back, be sure to select it in block mode (see this question).

No. Another editor may have this feature.

Sort of a late reply, but... I had the same problem when I moved to a new machine with Notepad++ installed. Previously, I was using a text editor called Boxer that had this feature, which I found invaluable. It's not freeware, however.

You may not be able to Search/Replace within a columnar selection, but you can easily carry out your task within Notepad++. Use Find and Replace feature, with the Regular Expressions box checked.
If you want to remove quotes only from a target column, use the following regular expression in the Find field:
(^([^,]*,){i})"([^,\n\r]*)"(.*$)
Replace i with the position of the target column minus 1
(e.g., use 2 if you want to strip the quotes around the third column, 0 for the first column, etc.).
In the Replace field use:
\1\3\4
Clicking "Replace All" will strip quotes from the target column.
If you want to blow away all quotes surrounding each element in your csv without prejudice, use the following regular expression in the Find field:
((?<=,)|(?<=^))"(.*?)"((?=$|,))
In the Replace field use:
\1\2\3
Clicking Replace All will strip the quotes from all the columns.
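Outside Notepad++, the same two replacements can be sketched with Python's re module. This is an adaptation under assumptions, not the exact expressions above: the function names are made up, the inner group is non-capturing (so the replacement is \1\2\3 rather than \1\3\4), and the code assumes "\n" line endings and no commas inside quoted fields.

```python
import re


def strip_quotes_in_column(csv_text, column_index):
    """Remove the quotes around one column (0-based index) on every line."""
    # Skip `column_index` comma-terminated fields, then capture the quoted field
    # and the rest of the line.
    pattern = re.compile(
        r'^((?:[^,\n]*,){%d})"([^,\n\r]*)"(.*)$' % column_index,
        re.MULTILINE,
    )
    return pattern.sub(r"\1\2\3", csv_text)


def strip_all_quotes(csv_text):
    """Remove the quotes around every field on every line."""
    # A field starts at the beginning of a line or right after a comma,
    # and ends right before a comma or the end of the line.
    pattern = re.compile(r'(?:^|(?<=,))"(.*?)"(?=,|$)', re.MULTILINE)
    return pattern.sub(r"\1", csv_text)


rows = '"0","1","2"\n"10","11","12"'
second_column_unquoted = strip_quotes_in_column(rows, 1)
everything_unquoted = strip_all_quotes(rows)
```

For anything beyond a one-off cleanup, Python's csv module is probably the safer choice, since it understands quoted fields that contain commas.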
Example
Since you didn't provide an example csv file, I'll walk through my own working example. Below is my csv:
"0","1","2","3","4","5","6","7","8","9"
"10","11","12","13","14","15","16","17","18","19"
"20","21","22","23","24","25","26","27","28","29"
"30","31","32","33","34","35","36","37","38","39"
"40","41","42","43","44","45","46","47","48","49"
"50","51","52","53","54","55","56","57","58","59"
"60","61","62","63","64","65","66","67","68","69"
"70","71","72","73","74","75","76","77","78","79"
"80","81","82","83","84","85","86","87","88","89"
"90","91","92","93","94","95","96","97","98","99"
"100","101","102","103","104","105","106","107","108","109"
"110","111","112","113","114","115","116","117","118","119"
"120","121","122","123","124","125","126","127","128","129"
"130","131","132","133","134","135","136","137","138","139"
"140","141","142","143","144","145","146","147","148","149"
"150","151","152","153","154","155","156","157","158","159"
"160","161","162","163","164","165","166","167","168","169"
"170","171","172","173","174","175","176","177","178","179"
"180","181","182","183","184","185","186","187","188","189"
"190","191","192","193","194","195","196","197","198","199"
If I wanted to remove quotes from the second column, I would use the following in the Find and Replace fields, respectively:
(^([^,]*,){1})"([^,\n\r]*)"(.*$)
\1\3\4
Clicking Replace All yields the below result:
"0",1,"2","3","4","5","6","7","8","9"
"10",11,"12","13","14","15","16","17","18","19"
"20",21,"22","23","24","25","26","27","28","29"
"30",31,"32","33","34","35","36","37","38","39"
"40",41,"42","43","44","45","46","47","48","49"
"50",51,"52","53","54","55","56","57","58","59"
"60",61,"62","63","64","65","66","67","68","69"
"70",71,"72","73","74","75","76","77","78","79"
"80",81,"82","83","84","85","86","87","88","89"
"90",91,"92","93","94","95","96","97","98","99"
"100",101,"102","103","104","105","106","107","108","109"
"110",111,"112","113","114","115","116","117","118","119"
"120",121,"122","123","124","125","126","127","128","129"
"130",131,"132","133","134","135","136","137","138","139"
"140",141,"142","143","144","145","146","147","148","149"
"150",151,"152","153","154","155","156","157","158","159"
"160",161,"162","163","164","165","166","167","168","169"
"170",171,"172","173","174","175","176","177","178","179"
"180",181,"182","183","184","185","186","187","188","189"
"190",191,"192","193","194","195","196","197","198","199"

My search on the internet, to see whether Notepad++ supports this, brought me here.
I have used TextPad and can confirm that it supports find-and-replace within a column-selected block. Also, TextPad is free for personal use.

Folder with 1300 png files into html images list

I've got folder with about 1300 png icons. What I need is html file with all of them inside like:
<img src="path-to-image.png" alt="file name without .png" id="file-name-without-.png" class="icon"/>
It's easy as hell, but with that number of files it's a pure waste of time to do it manually. Do you have any ideas how to automate it?

If you need it just once, then do a "dir" or "ls" and redirect it to a file, then use an editor with macro-ability like notepad++ to record modifying a single line like you desire, then hit play macro for the remainder of the file. If it's dynamic, use PHP.

I would not use C++ to do this. I would use vi, honestly, because running regular expressions repeatedly is all that is needed for this.
But you can do this in C++. I would start with a plain text file with all the file names, generated by dir or ls on the command prompt.
Then write code that takes a line of input and turns it into a line formatted the way you want. Test this and get it working on a single line first.
The RE engine of C++ is probably overkill (and is not all that well supported in compilers), but substr and basic find and replace is all you need. Is there a string library you are familiar with? std::string would do.
To generate the file name without PNG, check the last four characters and see if they exist and are .PNG (if not report an error). Then strip them. To remove dashes, copy characters to a new string but if you are reading a dash write a space. Everything else is just string concatenation.
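The steps described above (list the files, check and strip the extension, turn dashes into spaces, concatenate) can also be sketched in a few lines of Python rather than C++. This is only an illustration under assumptions: the function names are invented, and src is the bare file name, which assumes the HTML file sits in the same folder as the icons.

```python
import html
from pathlib import Path


def img_tag(filename):
    """Build one <img> line for a .png file name."""
    path = Path(filename)
    if path.suffix.lower() != ".png":
        raise ValueError(f"not a .png file: {filename}")
    stem = path.stem              # file name without ".png"
    alt = stem.replace("-", " ")  # dashes become spaces in the alt text
    return '<img src="{}" alt="{}" id="{}" class="icon"/>'.format(
        html.escape(filename), html.escape(alt), html.escape(stem)
    )


def image_list(folder):
    """Return one <img> line per .png file in `folder`, sorted by name."""
    names = sorted(
        p.name for p in Path(folder).iterdir() if p.suffix.lower() == ".png"
    )
    return "\n".join(img_tag(name) for name in names)


example = img_tag("arrow-left.png")
```

Writing image_list("path/to/icons") to a file then gives the whole list in one go; the folder path here is a placeholder.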

ColdFusion -- Do I need URLDecode with form POSTs? / URLDecode randomly removes one character

I'm using a WYSIWYG to allow users to format text. This is the error-causing text:
<p><span style="line-height: 115%">This text starts with a 'T'</span></p>
The error is that the 'T' in "This", or whatever the first letter happens to be, is randomly removed when using URLDecode and saving to the DB. Removing URLDecode on the server side seems to fix it without any negative side-effects (the DB contains the same information).
The documentation says that
Query strings in HTTP are always URL-encoded.
Is this really the case? If so, why doesn't removing URLDecode seem to mess everything up?
So two questions:
Why is URLDecode causing the first text character to be removed like this (it seems to only happen when the line-height property is present)?
Do I really need (or would I even want) to use URLDecode before putting POSTed data into the database?
Edit: I made a test page to echo back the decoded text, and URLDecode is definitely removing that character, but I have no idea why.

I believe decoding is done automatically when the form scope is populated. That's why characters after % (the character used for encoding) are removed -- you are trying to decode the string a second time.
For security reasons you might be interested in stripping script tags, or even cleaning up HTML using white-list. Try to search in CFLib.org for applicable functions.
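The double-decoding effect can be illustrated with Python's urllib.parse. This is an analogy only: it is not ColdFusion's URLDecode, which may treat a malformed sequence such as %"> differently, and the sample text is invented.

```python
from urllib.parse import quote, unquote

original = "rate: 5%41"                   # user text that happens to contain "%41"
on_the_wire = quote(original, safe="")    # what the browser would send
decoded_once = unquote(on_the_wire)       # what the server hands to the application
decoded_twice = unquote(decoded_once)     # an extra, unnecessary decode

# decoded_once is the original text again; decoded_twice has turned "%41" into "A".
```

If the framework has already decoded the form fields once, a second decode can only damage text that legitimately contains a percent sign.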

OutputDebugString + DebugView = not tabs!

I am dumping \t delimited data using OutputDebugString and then use ex-Sysinternals DebugView to capture it.
The problem is that all the data in DebugView appears to be space delimited, hence I need to perform CTRL+H "\x20" "\t" to replace the spaces with tabs before I can use it (I really need tab delimited data).
Is there any way to tell DebugView not to replace tabs with spaces?
Or maybe there is a better tool available to capture output of the OutputDebugString function?
Any ideas are very welcome!

It seems this is a "feature" of DebugView. I have tried Hoo Wintail, and it collects tabs without any problem. So I see 3 solutions:
1. Get Hoo Wintail (highly recommended).
2. Write your own tool (look here for some ideas on how to do it, or even to get a complete one).
3. Redirect to a file.
I strongly vote for option 1.

Why not write them to a local log file (only in debug mode)?

You can use multiple spaces instead of a tab.

DebugOutput and DebugView are intended for situations as implied by their name: debug. They are not intended to replace file-save functionality.
You are probably in the situation where analyzing the debug output means analyzing the tab-delimited format. Find another character that can be used instead of tab, e.g. | or # or ^.
Then open the debug output in an advanced editor (e.g. UltraEdit) and convert the character back to Tab.
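The round trip suggested above can be sketched as follows. This assumes the placeholder character never occurs in the data itself; the function names are hypothetical, and the real producer would be whatever code calls OutputDebugString.

```python
PLACEHOLDER = "|"  # assumed not to occur in the data itself


def to_debug_line(fields):
    """Join the fields with the placeholder instead of a tab before logging."""
    return PLACEHOLDER.join(str(field) for field in fields)


def to_tab_delimited(captured_line):
    """Convert a captured line back to tab-delimited form."""
    return captured_line.replace(PLACEHOLDER, "\t")


logged = to_debug_line(["id", 42, "two words"])
restored = to_tab_delimited(logged)
```

Unlike replacing spaces with tabs after the fact, this keeps spaces that are part of a field intact.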

Regular Expression Carriage Return Find & Replace on Google Docs

On Google Docs, I want each list-item (my bullet is an en dash, "-"), of which there are over 1,000 in 20 or so documents, to be separated by an additional line feed. It makes it easier to read on mobile devices.
How can I search for a line feed delimiting a bullet, and replace it with two line feeds?
(I.e. the equivalent of searching for "^p-" and replacing it with "^p^p-" in Microsoft Word)

I am a little confused about your question, but:
I found the only way to do something like this is to find each - or space (which Google Docs will find) and insert a "dummy" character.
Then do a search and replace with ,
Then copy all, delete all, go to the html mode and paste it.
Go back to normal view and you should have an extra space after each.
If you are trying to do this:
(I.e. the equivalent of searching for "^p-" and replacing it with "^p^p-" in Microsoft Word)
Go to html mode and find out which tag is causing the line break (it could be a div tag or a br tag; I can't get the tag characters <> to show). Copy everything, go back to normal mode, paste it all, use find and replace to swap whatever it was (br or div) for two of them, copy, delete, paste back in html mode, and you should have the extra line break. Hope that helps.
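If the text can be taken out of Google Docs as plain text, the Word-style replacement of "^p-" with "^p^p-" can be approximated with a regular expression. A minimal sketch, assuming "\n" line endings and that every list item starts with the bullet character at the beginning of a line; the function name is made up:

```python
import re


def add_blank_line_before_items(text, bullet="-"):
    """Double the line feed in front of each bulleted line.

    The lookbehind skips items that already have a blank line before them,
    so running the function twice does not keep adding line feeds.
    """
    pattern = re.compile(r"(?<!\n)\n(?=" + re.escape(bullet) + ")")
    return pattern.sub("\n\n", text)


spaced = add_blank_line_before_items("intro\n- one\n- two")
```

Any other line that happens to start with the bullet character (a negative number, for instance) is spaced out as well, so the result is worth a quick check.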

