Web Application Cross-Site Scripting (XSS)

My website http://www.imayne.com seems to have this issue, verified by McAfee. Can someone show me how to fix this?
It says this:
General Solution:
When accepting user input ensure that you are HTML encoding potentially malicious characters if you ever display the data back to the client.
Ensure that parameters and user input are sanitized by doing the following:
Remove < input and replace with "&lt;";
Remove > input and replace with "&gt;";
Remove ' input and replace with "&apos;";
Remove " input and replace with "&#x22;";
Remove ) input and replace with "&#x29;";
Remove ( input and replace with "&#x28;";
I cannot seem to show the actual code. This website is showing something else.
I'm not a web dev, but I can do a little. I'm trying to be PCI compliant.

Let me both answer your question and give you some advice. Preventing XSS properly needs to be done by defining a white-list of acceptable values at the point of user input, not a blacklist of disallowed values. This needs to happen first and foremost, before you even begin thinking about encoding.
Once you get to encoding, use a library from your chosen framework; don't attempt character substitution yourself. There's more information about this in OWASP Top 10 for .NET developers part 2: Cross-Site Scripting (XSS) (don't worry about it being .NET orientated, the concepts are consistent across all frameworks).
Now for some friendly advice: get some expert support ASAP. You've got a fundamentally obvious reflective XSS flaw in an e-commerce site and based on your comments on this page, this is not something you want to tackle on your own. The obvious nature of this flaw suggests you've quite likely got more obscure problems in the site as well. By your own admission, "you're a noob here" and you're not going to gain the competence required to sufficiently secure a website such as this overnight.
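To make the white-list idea concrete, here is a minimal sketch in Python. The field names and patterns are hypothetical and would need to match whatever the real forms accept; it only illustrates rejecting anything that does not fit a known-good shape before encoding is even considered.

```python
import re

# Hypothetical white-list rules: one known-good pattern per input field.
# The field names and patterns are assumptions made for illustration.
ALLOWED_PATTERNS = {
    "quantity": re.compile(r"[0-9]{1,3}"),
    "search": re.compile(r"[A-Za-z0-9 \-]{1,50}"),
}

def is_allowed(field, value):
    """Return True only if the field is known and the whole value matches."""
    pattern = ALLOWED_PATTERNS.get(field)
    if pattern is None:
        return False  # unknown fields are rejected, not passed through
    return pattern.fullmatch(value) is not None
```

Input that fails the check would be rejected outright rather than "cleaned"; output encoding is still needed afterwards for whatever does get through.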

The type of changes you are describing are often accomplished in several languages via an HTML Encoding function. What is the site written in? If this is an ASP.NET site this article may help:
In PHP use this function to wrap all text being output:
Anywhere you see echo(...) or print(...), you can replace it with:
// ENT_QUOTES also encodes single quotes, which ENT_COMPAT leaves untouched
echo(htmlentities($whateverWasHereOriginally, ENT_QUOTES, 'UTF-8'));
Take a look at the examples section in the middle of the page for other guidance.

Follow those steps exactly, and you're good to go. The main thing is to ensure that you don't treat anything the user submits to you as code (HTML, SQL, JavaScript, or otherwise). If you fail to properly clean up the inputs, you run the risk of script injection.
If you want to see a trivial example of this problem in action, search for
<span style="color:red">red</span>
on your site, and you'll see that the echoed search term is red.
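As a rough illustration of the fix for that search example, the sketch below uses Python's standard html.escape as a stand-in for whatever encoding function the site's own language provides. The render_search_heading helper and the heading markup are hypothetical.

```python
import html

def render_search_heading(term):
    """Build the 'results for ...' heading with the user's term HTML-encoded."""
    # quote=True also encodes double and single quotes, not just < > and &.
    return "<h1>Results for " + html.escape(term, quote=True) + "</h1>"

print(render_search_heading('<span style="color:red">red</span>'))
# prints: <h1>Results for &lt;span style=&quot;color:red&quot;&gt;red&lt;/span&gt;</h1>
```

With the term encoded, the browser shows the markup as literal text instead of rendering it.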


Submit form after user tweet

I have an idea and I'm not sure if it'll work (and I hope this is the appropriate place to ask this question).
Basically, what I'm trying to do is grab a user's tweet (one user in particular), and if the tweet matches a RegEx pattern, grab a part of it and submit a form on another page. The form has two parts, the first part is for the information, and the second is just a confirmation.
The only two languages I know that would have this capability (to my knowledge) are PHP and Java. Unfortunately my knowledge of PHP is fairly mediocre and my Java is pretty basic, so I'd need to do some research.
First of all, is what I want to do even possible, and secondly, what would I have to be looking at to pull this off? (I'm open to learning a similar language if necessary)
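The matching and form-building part of this idea can be sketched in a few lines, shown here in Python although PHP or Java would work just as well. The tweet format, the regular expression, and the form field names ("code", "confirm") are all assumptions; fetching the tweet and actually posting the form are left out because they depend on the Twitter API and the target site.

```python
import re
from urllib.parse import urlencode

# Hypothetical pattern: a six-character code announced as "code: XXXXXX".
PATTERN = re.compile(r"code:\s*([A-Z0-9]{6})", re.IGNORECASE)

def build_form_body(tweet_text):
    """Return a URL-encoded form body if the tweet matches, else None."""
    match = PATTERN.search(tweet_text)
    if match is None:
        return None
    # "code" and "confirm" are placeholder field names for the two form steps.
    return urlencode({"code": match.group(1).upper(), "confirm": "yes"})
```

The returned string is what would be sent as the body of a POST request to the form's action URL; a two-step form would typically need a second request for the confirmation page.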

A tool which checks that a local version of a site is fully translated (for continuous integration)

I'm working on a project, in which we design a localized version of an existing site (written in English) for another country (which is not English-speaking). And the business requirement is "no English text for all possible and impossible cases".
Does anyone know if there is checker software or a service that could check whether a site is fully translated, that is, that there is no English text in it?
I know that there are sites for checking broken links, HTML validity etc. I need something like http://validator.w3.org/checklink, but for checking that there is no English text on any page of the site.
The reasons I think this is needed are:
1. There is a lot of code which is common (both on backend and frontend) for all countries
2. If someone commits anything to the common code I need to be sure that this will not lead to English text issues in the localized version.
3. From a business point of view it is preferable that the site does not support some functionality than that it shows English text (legal matters)
4. The code both on frontend and backend changes a lot
5. There are a lot of files which affect text on the client's screen. Not just one with messages, unfortunately. And some of the messages come from the backend, but most of them are in the frontend
6. Due to all those facts, currently someone manually fills in all the forms and checks them with his own eyes, and that is before each deploy...
I think you're approaching the problem from the wrong direction. You're looking for an algorithm or web crawler that can detect whether any text is English or not? I don't know, but I doubt such a thing even exists.
If you have translated the website, you have full access to the codebase and/or translation texts, right? Can't you just open both the English and non-English strings files (.resx or whatever you are using) in a compare tool like Notepad++ to check the differences and see if there are any missing strings? And check the source code and verify that all parts that can output user-displayable text use the meta:resourceKey property (or whatever you are using).
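For continuous integration, the manual compare could be automated along these lines. This is only a sketch: it assumes the two strings files have already been loaded into dictionaries keyed by resource name (the loading step depends on the file format), and audit_translations is a hypothetical helper name.

```python
def audit_translations(source, translated):
    """Compare two {key: text} mappings and report suspicious keys.

    Returns (missing, unchanged): keys with no translation at all, and keys
    whose translated text is identical to the English source text.
    """
    missing = sorted(k for k in source if not translated.get(k, "").strip())
    unchanged = sorted(
        k for k in source
        if k not in missing and translated[k].strip() == source[k].strip()
    )
    return missing, unchanged

english = {"save": "Save", "cancel": "Cancel", "ok": "OK"}
localized = {"save": "Guardar", "ok": "OK"}
print(audit_translations(english, localized))
# prints: (['cancel'], ['ok'])
```

A build step could fail when either list is non-empty, with an exemption list for strings such as brand names that are legitimately identical in both languages.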
If you want to go the way of crawling, I'm not aware of an existing crawler that does this, but it sounds like a combination of two simple issues:
Finding existing open-source code for a web crawler should be dead simple.
Identifying a language through n-gram analysis is trivial if there's a limited number of languages the text can be in.
The only difficult part would be to ensure that the analyzer always has a decent chunk of text to work with. You could extract stuff paragraph by paragraph. For forms you'd probably have to combine the text of several form labels.
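A toy version of the n-gram step might look like the following. It is a sketch under heavy assumptions: the two sample sentences are made up and stand in for real training text, only two languages are modelled, and character trigrams with cosine similarity are just one simple way to do the comparison.

```python
import math
from collections import Counter

def trigrams(text):
    """Count character trigrams of the lower-cased, whitespace-normalized text."""
    s = " " + " ".join(text.lower().split()) + " "
    return Counter(s[i:i + 3] for i in range(len(s) - 2))

def cosine(a, b):
    """Cosine similarity between two trigram count vectors."""
    dot = sum(count * b[gram] for gram, count in a.items() if gram in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

# Tiny made-up samples; a real check would build profiles from much more text.
SAMPLES = {
    "en": "the quick brown fox jumps over the lazy dog and the user must "
          "enter a valid email address before continuing to the next page",
    "es": "el usuario debe introducir una dirección de correo válida antes "
          "de continuar a la página siguiente y el perro salta sobre el zorro",
}
PROFILES = {lang: trigrams(sample) for lang, sample in SAMPLES.items()}

def guess_language(text):
    """Return the language whose trigram profile is most similar to the text."""
    grams = trigrams(text)
    return max(PROFILES, key=lambda lang: cosine(grams, PROFILES[lang]))
```

A crawler would feed each extracted paragraph (or group of form labels) to guess_language and flag pages where the answer is "en". Very short strings give unreliable results, which is why the text needs to be chunked as described above.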

how to handle multiple languages on website

I have a website that I am translating into different languages. I have the content translated and stored in a database. I also wrote, into the PHP files, different mechanisms that will display the language based on a global define I set high in the code. I am happy with all of this. My question is how do I control this global define?
I currently have a JavaScript toggle that sets a cookie and then reloads the current page. Every subsequent page just reads that cookie to set the global define. It works very well, however I am running into two big problems. (1) I can't have a URL with the language in it to send to somebody (I could do something like domain.com/forwarder.php?lan=spanish&gotopage=page.php that would set a cookie and then forward, but that's ugly). And (2), search engines can't view the multiple languages since they don't really use cookies and JavaScript.
So how do I solve this? Does anybody have experience in this? Can you share your experiences?
I'm leaning towards just using the URL and dropping the cookie; that seems popular among various international sites I've seen. So I'm guessing the URLs would be:
domain.com/page (for English, equivalent to domain.com/en/page)
domain.com/es/page (for Spanish)
domain.com/fr/page (for French)
etc.
Is this a good idea? I will have to go through my code and prepend the language code to all my hrefs, which might be a pain.
So does anybody have any comments on this? Is this a good plan? Am I neglecting to realize something?
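The URL scheme described above comes down to two small operations: reading the language from the first path segment, and building links for a given language. Here is a minimal sketch in Python; the supported-language set and the helper names are assumptions, and on the actual site the same logic would live in PHP.

```python
SUPPORTED = {"en", "es", "fr"}  # assumed set of translated languages
DEFAULT = "en"                  # unprefixed URLs are treated as English

def split_language(path):
    """Return (language, path without the language prefix)."""
    segments = [s for s in path.split("/") if s]
    if segments and segments[0] in SUPPORTED:
        return segments[0], "/" + "/".join(segments[1:])
    return DEFAULT, "/" + "/".join(segments)

def localized_href(path, lang):
    """Rewrite a site-relative path so that it points at the given language."""
    _, bare = split_language(path)
    return bare if lang == DEFAULT else "/" + lang + bare
```

Routing every link through one helper like localized_href means the language code is added in a single place instead of being hand-edited into each href.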
It's been a long time, but can't you use $_SERVER["HTTP_ACCEPT_LANGUAGE"] and set it automatically? Then, before writing the cookie for the first time, show a message on the screen, in either English or another language from the array, asking if this is the correct language, with a drop-down of available languages. Once one is selected, store it as the default website language.
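For reference, the Accept-Language header is a comma-separated list of language tags with optional q weights, for example "fr-CA,fr;q=0.9,en;q=0.8". The sketch below shows one way to pick a supported language from it, written in Python for illustration; the supported list and function names are assumptions, and it handles only the common cases rather than the full header grammar.

```python
def parse_accept_language(header):
    """Return [(tag, q), ...] sorted by descending q, keeping header order on ties."""
    prefs = []
    for part in header.split(","):
        part = part.strip()
        if not part:
            continue
        tag, _, params = part.partition(";")
        q = 1.0  # a tag without an explicit weight counts as q=1
        params = params.strip()
        if params.startswith("q="):
            try:
                q = float(params[2:])
            except ValueError:
                q = 0.0  # malformed weight: treat the tag as not acceptable
        prefs.append((tag.strip().lower(), q))
    return sorted(prefs, key=lambda pref: pref[1], reverse=True)

def pick_language(header, supported=("en", "es", "fr"), default="en"):
    """Pick the first supported primary language, or fall back to the default."""
    for tag, q in parse_accept_language(header):
        primary = tag.split("-")[0]
        if q > 0 and primary in supported:
            return primary
    return default
```

The result is best treated as an initial guess that the visitor can override, as suggested above.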
You can use string constants in global resource files. Have only one website that calls those string constants based on the current language.

What are my options for white-listing HTML in ColdFusion?

I want to allow my users to input HTML.
1. Allow a specific set of HTML tags.
2. Preserve characters (do not encode ã into &atilde;, for example)
Existing options
AntiSamy. Unfortunately AntiSamy encodes special characters and breaks requirement 2.
Native ColdFusion functions (HTMLCodeFormat() etc...) don't work as they encode HTML into entities, and thus fail requirement 1.
I found this set of functions somewhere, but I have no way of telling how secure this is: http://pastie.org/2072867
So what are my options? Are there existing libraries for this?
Portcullis works well for ColdFusion for attack-specific issues. I've used a couple of other regex solutions I found on the web over time that have worked well, though they haven't been nearly as fleshed out. In 15 years (10 as a CMS developer) nothing I've built has been hacked... knock on wood.
When developing input fields of any type, it's good to look at the problem from different angles. You've got the UI side, which includes both usability and client-side validation. Yes, it can be bypassed, but JavaScript-based validation is quicker, more responsive, and rates higher on the magical UI scale than interrupting the user from the back end or simply making things "disappear" without warning. It will speed up the back-end validation because it does the initial screening. So, it's not an "instead of" but an "in addition to" type of solution that can't be ignored.
Also on the UI front, giving your users a good quality editor can make a huge difference in the process. My personal favorite is CKEditor, simply because it's the only one that can handle Microsoft Word code on the front side, keeping it far away from my DB. It seems silly, but Word HTML is valid, so it won't set off any red flags... but on a moderately sized document it will quickly overload a DB field insert max, believe it or not. Not only will a good editor reduce the amount of silly HTML that comes in, but it will also just make things faster for the user... win/win.
I personally encode and decode my characters... it's always just worked well so I've never changed practice.
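To show what the two requirements in the question amount to, here is a small allow-list sketch built on Python's standard html.parser. It is not ColdFusion and not a vetted security library; the tag list is an assumption, all attributes are dropped, and it does not balance unclosed tags. It only illustrates keeping a fixed set of tags while leaving characters such as ã unencoded.

```python
import html
from html.parser import HTMLParser

ALLOWED_TAGS = {"b", "i", "em", "strong", "p", "ul", "ol", "li"}  # assumed list
DROP_CONTENT = {"script", "style"}  # remove these tags together with their text

class AllowlistSanitizer(HTMLParser):
    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self.skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in DROP_CONTENT:
            self.skip_depth += 1
        elif tag in ALLOWED_TAGS and not self.skip_depth:
            self.out.append("<" + tag + ">")  # every attribute is discarded

    def handle_endtag(self, tag):
        if tag in DROP_CONTENT:
            self.skip_depth = max(0, self.skip_depth - 1)
        elif tag in ALLOWED_TAGS and not self.skip_depth:
            self.out.append("</" + tag + ">")

    def handle_data(self, data):
        if not self.skip_depth:
            # Escapes only &, < and >; non-ASCII characters pass through as-is.
            self.out.append(html.escape(data, quote=False))

def sanitize(markup):
    parser = AllowlistSanitizer()
    parser.feed(markup)
    parser.close()  # flush any text still buffered by the parser
    return "".join(parser.out)
```

Tags outside the list are removed while their text is kept, so sanitize('<p onclick="steal()">São Paulo</p>') comes back as '<p>São Paulo</p>'.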

Is there any reason to sanitize user input to prevent them from cross site scripting themself?

If I have fields that will only ever be displayed to the user that enters them, is there any reason to sanitize them against cross-site scripting?
Edit: So the consensus is clear, that it should be sanitized. What I'm trying to understand is why? If the only user that can ever view the script they insert into the site is the user himself, then the only thing he can do is execute the script himself, which he could already do without my site being involved. What's the threat vector here?
Theoretically: no. If you are sure that only they will ever see this page, then let them script whatever they want.
The problem is that there are a lot of ways in which they can make other people view that page, ways you do not control. They might even open the page on a coworker's computer and have them look at it. It is undeniably an extra attack vector.
Example: a pastebin without persistent storage; you post, you get the result, that's it. A script can be inserted that inconspicuously adds a "donate" button linking to your PayPal account. Put it up on enough people's computers, hope someone donates, ...
I agree that this is not the most shocking and realistic of examples. However, once you have to defend a security-related decision with "that is possible but it does not sound too bad," you know you crossed a certain line.
Otherwise, I do not agree with answers like "never trust user input." That statement is meaningless without context. The point is how to define user input, which was the entire question. Trust how, semantically? Syntactically? To what level; just size? Proper HTML? Subset of Unicode characters? The answer depends on the situation. A bare webserver "does not trust user input" but plenty of sites get hacked today, because the boundaries of "user input" depend on your perspective.
Bottom line: avoid allowing anybody any influence over your product unless it is clear to a sleepy, non-technical consumer what and who.
That rules out almost all JS and HTML from the get-go.
P.S.: In my opinion, the OP deserves credit for asking this question in the first place. "Do not trust your users" is not the golden rule of software development. It is a bad rule of thumb because it is too destructive; it detracts from the subtleties in defining the frontier of acceptable interaction between your product and the outside world. It sounds like the end of a brainstorm, while it should start one.
At its core, software development is about creating a clear interface to and from your application. Everything within that interface is Implementation, everything outside it is Security. Making a program do the things you want it to is so preoccupying one easily forgets about making it not do anything else.
Picture the application you are trying to build as a beautiful picture or photo. With software, you try to approximate that image. You use a spec as a sketch, so already here, the more sloppy your spec, the more blurry your sketch. The outline of your ideal application is razor thin, though! You try to recreate that image with code. Carefully you fill the outline of your sketch. At the core, this is easy. Use wide brushes: blurry sketch or not, this part clearly needs coloring. At the edges, it gets more subtle. This is when you realize your sketch is not perfect. If you go too far, your program starts doing things that you do not want it to, and some of those could be very bad.
When you see a blurry line, you can do two things: look closer at your ideal image and try to refine your sketch, or just stop coloring. If you do the latter, chances are you will not go too far. But you will also make only a rough approximation of your ideal program, at best. And you could still accidentally cross the line anyway! Simply because you are not sure where it is.
You have my blessing in looking closer at that blurry line and trying to redefine it. The closer you get to the edge, the more certain you are where it is, and the less likely you are to cross it.
Anyway, in my opinion, this question was not one of security, but one of design: what are the boundaries of your application, and how does your implementation reflect them?
If "never trust user input" is the answer, your sketch is blurry.
(and if you don't agree: what if OP works for "testxsshere.com"? boom! check-mate.)
(somebody should register testxsshere.com)
Just because you don't display a field to someone doesn't mean that a potential Black Hat doesn't know that it's there. If you have a potential attack vector in your system, plug the hole. It's going to be really hard to explain to your employer why you didn't if it's ever exploited.
I don't believe this question has been answered entirely. He wants to see an actual XSS attack in the case where the user can only attack himself. This is actually done by a combination of CSRF and XSS.
With CSRF you can make a user make a request with your payload. So if a user can attack himself using XSS, you can make him attack himself (make him make a request with your XSS).
A quote from The Web Application Hacker’s Handbook:
“We’re not worried about that low-risk XSS bug. A user could exploit it only to attack himself.”
Even apparently low-risk vulnerabilities can, under the right circumstances, pave the way for a devastating attack. Taking a defense-in-depth approach to security entails removing every known vulnerability, however insignificant it may seem. The authors have even used XSS to place file browser dialogs or ActiveX controls into the page response, helping to break out of a kiosk-mode system bound to a target web application. Always assume that an attacker will be more imaginative than you in devising ways to exploit minor bugs!
Yes, always sanitize user input:
1. Never trust user input.
2. It does not take a lot of effort to do so.
The key point being 1.
If the script, or service, that the form submits the values to is available via the internet then anyone, anywhere, can write a script that will submit values to it. So: yes, sanitize all inputs received.
The most basic model of web-security is pretty simple:
Do not trust your users
It's also worth linking to my answer in another post: Steps to become web-security savvy.
I can't believe I answered without referring to the title-question:
Is there any reason to sanitize user input to prevent them from cross site scripting themself?
You're not preventing the user's being cross-site scripted, you're protecting your site (or, more importantly, your client's site) from being the victim of cross-site scripting. If you don't close known security holes because you couldn't be bothered, it will become very hard to get repeat business, or good word-of-mouth advertising and recommendations from previous clients.
Think of it less as protecting your client, think of it -if it helps- as protecting your business.