Remove alt attributes from HTML - regex

I have a huge HTML file which I'm trying to format so that I can import the content into a different application. The one thing that remains is that I need to remove all alt attributes from the HTML entirely. They all have different values and there are around 5000 of them, so clearly a straight find & replace isn't an option. Perhaps there's a way to find and replace with regex in Visual Web Developer?
The tools/skills I have available are: HTML, JavaScript, ASP (Classic), a little bit of .NET, and Visual Web Developer Express 2010. The only similar things I can find are PHP-based, and they don't explain things fully enough for me to set up a solution and feed the HTML to it.
I've found things like this: Regular expression to replace several html attributes, which suggest regex functions that do similar things, but I'm not even sure how to run a regex function on a straight HTML file (my browser is struggling with the size of the HTML file as it is, so I don't think JavaScript is going to cut it).
Can anyone suggest the best way to accomplish this?
Thanks folks...

Since you use Visual Studio, you can try the Regex search & replace option, though the implementation of regexes in Visual Studio is pretty different from other regex engines.
Here's a short article about it:
http://www.codinghorror.com/blog/2006/07/the-visual-studio-ide-and-regular-expressions.html
As it says in the article, the built-in regex engine isn't ideal. They mention a plugin which implements standard regexes though:
http://www.codeproject.com/Articles/9125/Standard-Regular-Expression-Searcher-Addin-For-VS
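If running the replacement outside the editor is an option, the same idea can be sketched as a small script. This is only an illustration and assumes Python is available, which the question does not mention; the names `ALT_ATTR`, `strip_alt` and `strip_alt_file` are made up for this example, and the UTF-8 encoding is an assumption about the file. The pattern is standard regex syntax, not Visual Studio's own dialect.

```python
import re

# Whitespace, then alt, then "=", then a double-quoted, single-quoted,
# or unquoted value. The leading \s keeps attributes such as
# data-alt="..." from matching.
ALT_ATTR = re.compile(r'''\salt\s*=\s*(?:"[^"]*"|'[^']*'|[^\s>]+)''', re.IGNORECASE)


def strip_alt(html: str) -> str:
    """Return html with every alt attribute removed."""
    return ALT_ATTR.sub("", html)


def strip_alt_file(src_path: str, dst_path: str) -> None:
    """Read src_path, remove alt attributes, write the result to dst_path.

    Assumes the file is UTF-8 encoded; change the encoding if it is not.
    """
    with open(src_path, encoding="utf-8") as src:
        cleaned = strip_alt(src.read())
    with open(dst_path, "w", encoding="utf-8") as dst:
        dst.write(cleaned)


if __name__ == "__main__":
    sample = '<img src="a.png" alt="A cat" width="10"><img alt=\'x\' src="b.png">'
    print(strip_alt(sample))
```

Note that a regex does not know where tags begin and end, so text such as ` alt="..."` appearing in the page content itself would be removed as well. For a one-off cleanup of generated markup that is usually acceptable, but it is worth checking the output before importing it.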

Related

How to colorize plain text files in browser based on multiple regular expressions?

I often have to look at long plain text log files in the browser, and I find it hard to spot the error messages in that amount of text.
Most error messages are easily identifiable using regular expressions. I have been using this trick to highlight important text in iTerm2 for years.
Now the interesting challenge is how to bring the same feature to the browser?
I should mention that I need to be able to enable this feature per domain, as it may have undesirable side effects on others.
A cross-browser solution would be ideal, but I am ready to switch browsers (Safari, Chrome, Firefox).
I know of GreaseMonkey for Firefox, which allows you to inject JavaScript into a webpage. I'm not familiar with how you create scripts to use with it, but perhaps you could match the messages using regex. I'm not sure how you would highlight a plain text file using it though.
You can easily develop an extension which will inject a CSS file into whichever webpage you want. In this CSS file you will need a pseudo-selector (do all the text elements have the same class or other attribute?). You will need to figure this out yourself. Once you have hold of all these elements, you can add a CSS rule to change their color to whatever you like.
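Neither answer shows the matching step itself. As a rough sketch of it, the snippet below converts a plain text log into HTML with matching lines colored. It is a different technique from a user script or an extension (it runs outside the browser), and the rules in `RULES` and the function name `colorize` are hypothetical; the patterns would need adapting to the actual log format.

```python
import html
import re

# Hypothetical rules: (pattern, CSS color). The first matching rule wins.
RULES = [
    (re.compile(r"\b(ERROR|FATAL)\b"), "red"),
    (re.compile(r"\bWARN(ING)?\b"), "orange"),
]


def colorize(log_text: str) -> str:
    """Return an HTML <pre> block with matching lines wrapped in colored spans."""
    out = []
    for line in log_text.splitlines():
        # Escape first so that "<" or "&" in the log cannot break the markup.
        escaped = html.escape(line)
        for pattern, color in RULES:
            if pattern.search(line):
                escaped = f'<span style="color:{color}">{escaped}</span>'
                break
        out.append(escaped)
    return "<pre>" + "\n".join(out) + "</pre>"
```

A user script would apply the same loop to the text of the page's `<pre>` element instead of to a string read from disk.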

Sublime Text: interactive confirm for replace?

I need to do a lot of search/replace across 50+ files, and am using Sublime Text 3.
Is there a way to step through and interactively confirm each change? I don't want a blanket Replace All action that just performs all replacements.
I am thinking way back to vi/vim with its %s/old/new/gc functionality.
Both the Find/Replace and Find in Files/Replace commands don't natively support prompting you if the replacement should happen. Regular in-buffer find/replace just replaces directly and the only confirmation that you can get is when you do a Find in Files and Sublime prompts you to confirm the replacement after telling you how many replacements will be made.
As such, the only way to get something like this is to look to an external plugin/package that would do its own find and replace operation so that you could be asked to confirm the changes.
I'm not personally aware of any packages that would do this, but a search in Package Control turns up the RegReplace package, which lists among its features:
Create commands that highlight results and requiring confirmation before replacing.
That said I've never used the package myself, and from briefly looking at the documentation site it seems like it's only capable of searching in the current document and not across files.
A potential workaround would be to use the native Find in Files to find all files with matches, then manually open them and use RegReplace to perform the same operation again.
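If stepping outside the editor is acceptable, confirm-per-match replacement is also easy to script. The following is a minimal sketch, not a Sublime Text feature or plugin API; `replace_with_confirm`, `ask` and `replace_in_files` are names invented for this example, and the files are assumed to be UTF-8.

```python
import re
from pathlib import Path
from typing import Callable, Iterable


def replace_with_confirm(text: str, pattern: str, replacement: str,
                         confirm: Callable[[re.Match], bool]) -> str:
    """Replace each match of pattern only when confirm(match) returns True."""
    def decide(match: re.Match) -> str:
        # expand() resolves backreferences such as \1 in the replacement.
        return match.expand(replacement) if confirm(match) else match.group(0)
    return re.sub(pattern, decide, text)


def ask(match: re.Match) -> bool:
    """Prompt on the terminal, similar in spirit to vim's :%s/old/new/gc."""
    answer = input(f"Replace {match.group(0)!r} at offset {match.start()}? [y/N] ")
    return answer.strip().lower() == "y"


def replace_in_files(paths: Iterable[str], pattern: str, replacement: str,
                     confirm: Callable[[re.Match], bool] = ask) -> None:
    """Run the confirmed replacement over several files (assumed UTF-8)."""
    for path in paths:
        file = Path(path)
        original = file.read_text(encoding="utf-8")
        updated = replace_with_confirm(original, pattern, replacement, confirm)
        if updated != original:
            file.write_text(updated, encoding="utf-8")
```

Passing the decision in as a callable keeps the replacement logic separate from the prompt, so the same function can be driven by a terminal prompt, a list of pre-approved offsets, or a test.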

Autoclose xml tags in C/C++ file in vim

I have some documentation strings embedded within the source code (C/C++ files) as XML tags and I'd like to know what's the most minimal solution to make vim autoclose the tags (closest matching tag).
I've found closetag.vim, but is there a way to do this neatly without modifying anything but the .vimrc file?
Vim has no built-in support for that, so the closetag.vim plugin is the proper and easiest solution. (I use it myself, too!) Of course, you can develop your own simple mappings (that search backwards for an open tag, get that, drop the attributes, add the slash, and insert that), but:
that will either be very simplistic and therefore often wrong
or ends up with as much complexity as closetag, becoming a reimplementation of that plugin
If some rather strange restrictions (e.g. a custom primitive sync across systems) only allow you to manipulate the ~/.vimrc itself, you could just append the entire plugin's code to it (though I'd recommend against such an ugly hack).
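To make the "simple mapping" idea concrete, here is the search-backwards logic sketched in Python rather than Vim script. It is not how closetag.vim works internally; `TAG` and `closing_tag_for` are names chosen for this illustration only.

```python
import re
from typing import Optional

# Opening tags (<name attrs>), closing tags (</name>) and self-closing
# tags (<name/>). Attributes are matched only so they can be dropped.
TAG = re.compile(r"<(/?)([A-Za-z_][\w:.-]*)[^<>]*?(/?)>")


def closing_tag_for(text_before_cursor: str) -> Optional[str]:
    """Return '</name>' for the nearest tag that is still open, or None."""
    stack = []
    for match in TAG.finditer(text_before_cursor):
        is_close, name, self_closing = match.groups()
        if self_closing:
            continue
        if is_close:
            # Pop back to the matching opener, if there is one.
            if name in stack:
                while stack.pop() != name:
                    pass
        else:
            stack.append(name)
    return f"</{stack[-1]}>" if stack else None
```

The sketch also shows why the simplistic approach is often wrong in C/C++ files: a line such as `#include <stdio.h>` or a template argument list looks like an opening tag to this pattern, so a usable version would have to restrict itself to comment text.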

Extracting key words from HTML to C++ under linux

I am working on a simple client-server project. The client is written in Java; it sends keywords to a C++ server running under Linux and receives a list of the best-ranked URLs (ranked by the number of occurrences of the keywords). The server's job is to go through some URLs in search of the keywords and return the best-fitting URLs.
The problem is that I have to parse HTML pages to find occurrences of the keywords, and I also need to extract links from each visited page so that I can search those as well. What library can I use to do that? Only C++ libraries for Linux are suitable for me.
There were some similar topics and I went through most of them, but some of the libraries parse only HTML files, and I don't want to download every site I visit; I want to parse it on the fly and just store its rank and URL. Others look a bit complicated to me, for instance first converting the HTML to XML or something else and only then working on the results in C++. Is there something simple and sufficient to do what I need? Any advice will be appreciated.
I don't think regular expressions are appropriate for HTML parsing. I'm using libxml2, and I enjoy it very much - easy to use, portable and lightning fast.
To fetch pages from the web in C/C++ you could use the libcurl library. To extract URLs and other less straightforward things from a page you can use a regex library.
Separating the HTML tags from the real content can also be done without the use of a library.
For more advanced tasks one could use Qt, which offers classes such as QWebPage (which uses WebKit) that let you access the DOM of the page and extract individual HTML objects (e.g. single cells of a table) rather easily.
You can try xerces-c. It's a powerful library for XML parsing. It supports reading XML on the fly, with both DOM and SAX parsing.
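None of the answers shows what the per-page work looks like. The sketch below illustrates it with an event-driven parser, written in Python with the standard `html.parser` module purely to show the shape of the logic. It is not one of the C++ libraries suggested above, and `PageScanner` and `scan` are hypothetical names. A SAX-style C++ parser would be driven through similar start-tag and text callbacks.

```python
from html.parser import HTMLParser


class PageScanner(HTMLParser):
    """Collects href links and counts keyword occurrences in visible text."""

    def __init__(self, keywords):
        super().__init__()
        self.counts = {k.lower(): 0 for k in keywords}
        self.links = []
        self._skip = 0  # depth inside <script>/<style>, whose text is ignored

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1
        elif tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if self._skip:
            return
        text = data.lower()
        for keyword in self.counts:
            # Plain substring count: "red" also matches inside "credit".
            self.counts[keyword] += text.count(keyword)


def scan(html_text, keywords):
    """Return (rank, links), where rank is the total keyword count."""
    scanner = PageScanner(keywords)
    scanner.feed(html_text)  # feed() may also be called once per received chunk
    scanner.close()
    return sum(scanner.counts.values()), scanner.links
```

Because the parser is fed incrementally, the page never has to be saved to disk: only the rank and the list of links need to be kept once parsing finishes.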

Is there an editor to autoindent / add spaces in existing code?

I have some JavaScript code which I want to make more readable. However, due to the amount of code I want to use a tool which does this automatically for me.
Are there such tools already available or do I have to manually perform some "find/replace-alls"?
The code which I want to convert is written on a single line without spaces.
Quick search found http://jsbeautifier.org/ which seems to do what you are looking for online.
Search terms: javascript beautifier