Regex to identify Store Credit Card numbers

There are very detailed regex expressions to identify Visa, MasterCard, Discover and other popular credit card numbers.
However, there are many other credit cards, popularly known as store credit cards, that are not Visa- or Amex-powered. Examples include Amazon, GAP brands, Williams Sonoma, Macy's and so on. Most of these are Synchrony Bank credit cards.
Is there a regex to identify these different brand credit card numbers?

It's ludicrous to use a regex to identify the network. At most, all it takes is a prefix match.
A card number usually has 16 digits, although lengths from 13 to 19 digits exist. The first few digits identify the network and the bank.
Some people would say that Visa starts with 4 and MasterCard starts with 5, but that's a broad approximation at best. Check it against your own card; it should be right most of the time.
It would be easy to figure out what a card is if one could get a registry of known prefixes, but there is no public registry to my knowledge. I highly doubt that any of the parties involved would like to publish that information.

The first eight digits (until recently, six digits) of an international card number are known as the Issuer Identification Number (IIN), and the registry that maintains this index is the American Bankers Association.
The list of IINs is updated monthly and spans tens of thousands of rows. Unfortunately, a fixed regex isn't going to stay accurate for any length of time.
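The prefix matching described in these answers can be sketched in Python as a longest-prefix lookup. This is only an illustration: `TOY_IIN_TABLE` and `lookup_issuer` are hypothetical names, and the table holds a handful of broad, widely quoted prefixes rather than real registry data, which would have to come from a licensed and regularly refreshed source.

```python
# Toy prefix table (an assumption for illustration, not registry data).
TOY_IIN_TABLE = {
    "4": "Visa",
    "34": "American Express",
    "37": "American Express",
    "6011": "Discover",
}

MAX_IIN_LENGTH = 8  # IINs are now up to eight digits long


def lookup_issuer(card_number, table=TOY_IIN_TABLE):
    """Return the issuer for the longest matching prefix, or None."""
    # Ignore spaces, dashes and any other non-digit characters.
    digits = "".join(ch for ch in card_number if ch.isdigit())
    # Try the most specific (longest) prefix first.
    for length in range(min(MAX_IIN_LENGTH, len(digits)), 0, -1):
        issuer = table.get(digits[:length])
        if issuer is not None:
            return issuer
    return None
```

Because the longest prefix wins, a more specific row (for example a store card's eight-digit IIN, if you had one) would take precedence over a broad one-digit network entry.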

AWS transcribe speaker diarization, segments single speaker sentences into multiple different speakers

A bit of context: we have been using AWS Transcribe for English transcription for the past year. When the number of speakers is unknown, Transcribe asks you to provide the maximum number of speakers; by default we pass 5.
Over the last month we have observed that its ability to differentiate between speakers has dropped drastically. Words spoken by a single speaker get split into sentences attributed to multiple speakers, even when there are only 2 speakers.
Any pointers would be helpful.

How to mask credit card numbers in a text?

I have a form on my website that customers use to send me messages. Sometimes they write their credit card number in the message, which is a serious problem, so I want to mask these numbers. Of course, the card numbers don't arrive in a consistent format.
Example 1: 1111222233334444
Example 2: 4444 3333 2222 1111
Example 3: 4444-3333-2222-1111
Example 4: 4444 - 3333 - 2222 - 1111
Example 5: 4444--3333--2222--1111
I can mask examples 1, 2 and 3, but I can't when there is more than one space or dash between the groups of digits.
And this is my last regex:
preg_replace("/(?:\b| )([3456]\d{3})([ -]+){0,1}\d{4}([ -]+){0,1}\d{4}([ -]+){0,1}(\d{0})/", "$1********$2", $a1);
And results for this regex:
Result 1: 4444********1111
Result 2: 4444******** 1111
Result 3: 4444********-1111
Result 4: 4444******** - 1111
Result 5: 4444********--1111
So what should I do in regex? Thanks.
May I suggest that you separate validation of your credit card number from the presentation of that number to your users via the UI? Assuming you have only stored valid credit card numbers, it is probably safe to assume that every number has at least 8 digits. If so, you can just use a blanket regex to display only the first 4 and last 4 digits:
$cc = "4444--3333--2222--1111";
echo preg_replace("/(\d{4}).*(\d{4})/", "$1********$2", $cc);
4444********1111
You might point out that this puts the same number of stars in between every card number. But, then again, this is a good thing, because it makes it even harder for a snooper to fish out what the real unmasked number actually is.
Edit:
Here is a smarter regex which will star out the middle portion of any number, leaving only the first and last 4 characters visible:
$cc = "4444--3333--2222--1111";
echo preg_replace("/(?<=.{4}).(?=.{4})/", "*", $cc);
4444**************1111
Note that this solution would not remove anything from 11114444 as a theoretical input.
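For readers not using PHP, the same lookaround idea can be sketched in Python. The name `mask_middle` is hypothetical; the pattern is the one from the answer above, which works in Python's `re` module because the lookbehind has a fixed width.

```python
import re

# Match any character with at least four characters before and after it.
MIDDLE = re.compile(r"(?<=.{4}).(?=.{4})")


def mask_middle(value):
    """Star out everything except the first and last four characters."""
    return MIDDLE.sub("*", value)
```

As the note above says, an eight-character input such as 11114444 comes back unchanged, because no character has four neighbours on both sides.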
How to mask credit card number mask in a text [with regex]?
Don't.
Sometimes they write their credit card number on the message.
They really shouldn't. Don't encourage this behavior. It is not PCI compliant:
What is PCI Compliance?
The Payment Card Industry Data Security Standard (PCI DSS) applies to companies of any size that accept credit card payments. If your company intends to accept card payment, and store, process and transmit cardholder data, you need to host your data securely with a PCI compliant hosting provider.
When you accept credit card data via a website, do so using an approved service provider like Stripe, PayPal, BlueSnap, SecurionPay, etc. These services are immensely popular not because it's hard to make payment systems, but because they're hard to make right (and legal). They all have PHP APIs, so you can have people enter credit card data that you never see, and still charge them for amounts that you agree upon.
For example, if you were using Stripe and you wished to inform your customer what credit card they signed up with, their card object has a last4 property that gives the last four digits of the card. At that point you never knew the full credit card number, and you didn't even have to consider whether giving the first four and the last four was a security violation.
Further guidelines:
Never store electronic track data or the card security number in any form
While you may have a business reason for storing credit card information, processing regulations specifically forbid the storage of a card’s security code or any “track data” contained in the magnetic strip on the back of a credit card.
The card security number, called by many acronyms including CVV2, CID, and CSC, is the three digit number on the back of Visa/MasterCard/Discover cards or the 4 digit number on the front of American Express cards. It is designed to provide a way for merchants to know whether a customer authorizing a transaction over the phone or via the Internet actually has the card in their possession. This approach only works if the security code is never stored with the card number. Electronic storage makes this easy. You simply do not create a field for the security code. For paper storage, you need to redact (cross out with a dark pen to make unreadable) the security code after you successfully process the transaction and before you store a paper authorization form. [...]
Clearly you should store neither security codes nor track data purposely. But, you need to make sure you don’t store it inadvertently as well. To do this, be certain to use only approved hardware and software. [...]
Make sure all electronic storage of credit card account numbers is encrypted and all paper storage is secured
[...] Electronic storage of credit card numbers is also common if, for example, you process recurring or repeat transactions. If you do this, you need to make certain that you never store these files unencrypted. You need to make certain that any electronic storage is encrypted using a robust encryption algorithm. That way, if your computer is stolen or if someone in your office gains unauthorized access, you have some level of protection for the credit card numbers.
There are many service providers that offer secure storage—either as a standalone service or as part of a payment processing package. These services typically provide you with a “Token” for a card number they store. You can store the token in any unsecured file. When you’re ready to process a payment, you simply send the service provider the token and it retrieves the full card number for the sole purpose of processing the payment. (It’s technically more complicated than that, but you get the idea.) Just be certain to use a PCI DSS Verified provider [...]
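Try the following regex, which allows any mix of spaces and dashes between the four-digit groups and captures the first and last groups:
\b([3-6]\d{3})(?: *-* *\d{4}){2} *-* *(\d{4})\b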
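A minimal sketch of how that last regex could be applied to free text, written in Python rather than PHP. `mask_cards` is a hypothetical helper name, and the replacement string mirrors the "first four, eight stars, last four" layout from the question.

```python
import re

# Four groups of four digits, separated by any run of spaces and dashes.
# The first digit must be 3-6, as in the regex from the answer above.
CARD_IN_TEXT = re.compile(r"\b([3-6]\d{3})(?: *-* *\d{4}){2} *-* *(\d{4})\b")


def mask_cards(text):
    """Replace the middle eight digits (and separators) of each match."""
    return CARD_IN_TEXT.sub(r"\1********\2", text)


message = "My card is 4444 - 3333 - 2222 - 1111, please charge it."
print(mask_cards(message))
# My card is 4444********1111, please charge it.
```

Note that this pattern treats any 16-digit run starting with 3 to 6 as a card number, so order numbers or similar identifiers may be masked too, and a number starting with another digit (such as 1111222233334444) is left alone.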

Should I collect and store the type of credit card (e.g. Visa) or determine it via the number?

I am writing code for entering credit cards into a database. I am aware there are regular expressions that can be used to determine type of card (Visa, MasterCard, etc). For instance, Visa's regex is /^4[0-9]{12}(?:[0-9]{3})?$/. I only deal with Visa, MasterCard, Discover Card, and American Express, with no other cards being supported.
My question is: Should I collect the card type from the user and store it in the database, or derive the card type from the card number, and not store it in the database? In other words, are there any cases where, for instance, a Visa card will not match the regex, but still be a Visa card?
This is not about Luhn checking, it's about regular expressions for determining card type.
are there any cases where, for instance, a Visa card will not match
the regex, but still be a Visa card?
Yes. That Visa expression only accepts 13 or 16 digits, when in reality Visa numbers can range from 13 to 19 digits, although 16 is by far the most common.
Should I collect the card type from the user
This is discussed in depth here: Why do credit card forms ask for Visa, MasterCard, etc.?
Since you comment that you are experiencing problems with incorrect user input, I would simply not ask users what card they are using and instead use a set of simple prefix and length regular expressions to identify the type of the card. You can then store this value in your database to save deriving it again in the future (for query purposes).
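As a rough illustration of "simple prefix and length regular expressions" for the four brands mentioned in the question, here is a Python sketch. The prefixes and lengths are the commonly quoted ones and should be treated as assumptions that go stale as issuers add ranges; `CARD_PATTERNS` and `detect_card_type` are hypothetical names.

```python
import re

# Commonly quoted prefix/length rules (assumptions, not an official list).
CARD_PATTERNS = {
    "visa": re.compile(r"4[0-9]{12,18}"),            # 13 to 19 digits
    "amex": re.compile(r"3[47][0-9]{13}"),           # 15 digits
    "mastercard": re.compile(                        # 51-55 or 2221-2720
        r"(?:5[1-5][0-9]{2}|222[1-9]|22[3-9][0-9]|2[3-6][0-9]{2}"
        r"|27[01][0-9]|2720)[0-9]{12}"
    ),
    "discover": re.compile(r"(?:6011|65[0-9]{2}|64[4-9][0-9])[0-9]{12}"),
}


def detect_card_type(number):
    """Return the brand name for a card number, or None if nothing matches."""
    digits = re.sub(r"[ -]", "", number)  # tolerate spaces and dashes
    for name, pattern in CARD_PATTERNS.items():
        if pattern.fullmatch(digits):
            return name
    return None
```

Storing the derived value alongside the card record, as suggested above, also means that later changes to these patterns do not silently reclassify existing rows.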

ABA RTN with valid checksum but is test only

Not sure where to post this question... I'd like to know if test-only ABA Routing Transit Numbers (RTNs) exist. In other words, the number will pass the checksum test, but it is for application testing only and will never be assigned to a financial institution by the registrar that manages the ABA RTNs.
I'm enhancing a web application that provides merchant services via credit card to also support echecks. For credit card test purposes I use VISA number 4111-1111-1111-1111 since that has been flagged as one of VISA's test card numbers and no credit charge will actually occur.
Is there an analogous number identified for ABA RTNs by the registrar? If not, does anyone know of an ABA RTN that is reserved for future use, still unassigned, and unlikely to be assigned (kind of like using all 9s for an SSN)?
Routing numbers whose first two digits fall in the ranges 13-20, 33-60, 73-79 or 81-99 are not assigned, per the Routing Number Policy, Section IV. Routing Number Structure (page 3).
Any 9-digit number that passes the checksum and starts with two digits in one of the above ranges is pretty much guaranteed to be unusable in the real world.
411411411 is what I use for testing when I'm worried about leaking out to the real world. Otherwise 123123123 is easy to remember too.
Never did get an answer... What I ended up doing is using one of the reserved RTNs, specifically 440000000.
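Putting these answers together, one way to produce a test-only RTN is to pick eight digits whose first two fall in an unassigned range and then compute the ninth (check) digit. The Python sketch below assumes the ranges quoted above and the standard 3-7-1 weighting; `rtn_check_digit` and `make_test_rtn` are hypothetical names.

```python
# Unassigned two-digit prefixes, as quoted in the answer above.
UNASSIGNED_PREFIX_RANGES = [(13, 20), (33, 60), (73, 79), (81, 99)]


def rtn_check_digit(first_eight):
    """Compute the ninth digit so the 3-7-1 weighted sum is a multiple of 10."""
    if len(first_eight) != 8 or not first_eight.isdigit():
        raise ValueError("expected exactly eight digits")
    d = [int(ch) for ch in first_eight]
    partial = (3 * (d[0] + d[3] + d[6])
               + 7 * (d[1] + d[4] + d[7])
               + (d[2] + d[5]))
    return (-partial) % 10


def make_test_rtn(first_eight):
    """Build a 9-digit RTN with a valid checksum from an unassigned prefix."""
    check = rtn_check_digit(first_eight)  # also validates the input
    prefix = int(first_eight[:2])
    if not any(low <= prefix <= high for low, high in UNASSIGNED_PREFIX_RANGES):
        raise ValueError("prefix is not in one of the unassigned ranges")
    return first_eight + str(check)


print(make_test_rtn("44000000"))  # 440000000
print(make_test_rtn("41141141"))  # 411411411
```

Both values mentioned in the answers (440000000 and 411411411) come out of this construction, while 123123123 would be rejected because the prefix 12 is not in an unassigned range.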

United States Banking Institution Account Number Regular Expression?

I have been tasked to "verify" the length of a U.S. Banking Institution ACCOUNT NUMBER for a web app I'm developing. I cannot find anything through SOF, Google, Fed reserve etc that outlines an account number standard length that we have in the United States. For the record I believe this is futile.
If someone could point me to any official documentation on the web, or has an example regular expression, or knows if there is a standard that exists, I would appreciate it greatly.
ADDED:
What would interest me even more, since the response is overwhelmingly that there is no standard: has anyone ever run into a bank account number that is not completely "numeric"?
ADDED:
Thanks to everyone and their responses. Due to having no standard in the US, we are not going to enforce a length check, and we are going to store the number as a varchar due to the fact that it may be possible that a bank may assign alpha characters in their account numbers. Seems 99.999999% unrealistic in our view, but no standard means we will accept alpha characters and run the check on the account number to verify if it works or not. Thanks again all!
There is no standard for US banks' account numbers.
IBAN is not used in the US.
There is a limit for ACH transactions (4-17 digits), but not all transactions have to be ACH.
And yes, the US banking system is antiquated.
I'm looking at a DW (Data Warehouse) of 38 different systems at a bank and the length of account varies widely depending on the product. Several of the systems have alphabetic characters in the account numbers. This is probably irrelevant since they are special types of customer accounts like brokerage accounts and other things which aren't accessible through ACH - you need to specify what kind of account you're interested in. If you restrict yourself to accounts which you can get to through ACH, you can simply restrict to numeric digits.
You can get a lot more information about ACH at: http://www.nacha.org/
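If you do restrict yourself to ACH-reachable accounts, the check reduces to "digits only, 4 to 17 long", using the limits quoted in the answers here. A minimal Python sketch follows; `looks_like_ach_account` is a hypothetical name, and passing this check says nothing about whether the account actually exists.

```python
import re

# 4 to 17 ASCII digits, per the ACH limits quoted in this thread.
ACH_ACCOUNT = re.compile(r"[0-9]{4,17}")


def looks_like_ach_account(value):
    """Return True if the value is plausibly an ACH account number."""
    return ACH_ACCOUNT.fullmatch(value.strip()) is not None
```

Accounts outside ACH (brokerage accounts and the like) may contain letters, so this check should not be applied to them.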
Good luck with that, because you can't.
Banks are free to use just about anything as an account number. I think the only validation you can do is whether or not they're numeric (as they all are).
The most common lengths for bank account numbers are 9, 10 and 12 digits, although they range from 4 to 17 digits long. I have a large database of valid numbers and there's no pattern that I can see to the "account number".
A "routing number" defines the bank (pretty much) but even within a particular routing number, the account numbers can be of different lengths.
This is why payroll services often require an extra day (or two) before initiating Direct Deposit in order to "prenote" the account (validate it by performing a no-op ACH transaction) because you really can't verify it otherwise.
You can validate the routing number (or ABA) by downloading the DB (fixed field width text format) from the federal reserve bank. The data is here:
https://www.frbservices.org/EPaymentsDirectory/fpddir.txt and the layout describing the data is here:
https://www.frbservices.org/EPaymentsDirectory/fedwireFormat.html
There are companies (lyonslive.com) that offer a web service to validate account numbers, but they charge per validation (volume-based pricing starting at 60 cents per check; if volume is high enough it can be as low as 24 cents).
Don't you mean International Bank Account Number? If yes, this is a regex for IBAN (all IBANs):
[a-zA-Z]{2}[0-9]{2}[a-zA-Z0-9]{4}[0-9]{7}([a-zA-Z0-9]?){0,16}
UPDATE: Actually, according to Wikipedia: Banks in the United States do not provide IBAN format account numbers. Any adoption of the IBAN standard by U.S. banks would likely be initiated by ANSI ASC X9, the U.S. financial services standards development organization but to date it has not done so. Hence payments to U.S. bank accounts from outside the U.S. are prone to errors of routing.
In addition to the other great answers here, I think it's helpful to know that routing numbers in the United States include a checksum digit, which allows quick validation that the user typed the number in correctly:
http://www.brainjar.com/js/validation/
Basically, all US routing numbers should pass the following test (note the outer parentheses, so that the modulo applies to the whole sum):
(3 * (digits[0] + digits[3] + digits[6]) +
 7 * (digits[1] + digits[4] + digits[7]) +
 (digits[2] + digits[5] + digits[8])) % 10 === 0
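The same test, written out as a small Python function for clarity. This is a sketch: `is_valid_routing_number` is a hypothetical name, and the weighted sum of all nine digits is taken modulo 10 as a whole.

```python
def is_valid_routing_number(rtn):
    """Check the 3-7-1 weighted checksum of a 9-digit US routing number."""
    if len(rtn) != 9 or not rtn.isdigit():
        return False
    d = [int(ch) for ch in rtn]
    total = (3 * (d[0] + d[3] + d[6])
             + 7 * (d[1] + d[4] + d[7])
             + (d[2] + d[5] + d[8]))
    return total % 10 == 0
```

A passing checksum only catches typing errors; it does not prove that the number is assigned to a real institution.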
Very interesting. It seems like all routing/transit numbers are 9 digits.
I just checked American Express's online bill pay, for bank accounts they limit their field to 15 numerics. Chase limits theirs to 17. I would probably continue checking and maybe start to call a few banks to find out what their specifications are. It doesn't seem like there is a standard.
Another potential way to determine the length would be to ask the company that performs the transaction. Where does the account number get used? They should have specifications on what they will accept.
I don't think there is a standard - different institutions seem to use different lengths of account number. There probably is an upper limit - it is unlikely to be less than 20.
There is no standard for a bank account number in the US. There is a standard for the routing number, because that's shared between banks; the account number, however, is only of use internally by the bank itself.