I'm writing a program which hashes passwords with the pbkdf2 method using cryptopp.
I have problems with validating the passwords. I have tried to compare the output in "length-constant" time but it always fails and returns false.
// a and b are std strings containing the output of the DeriveKey function
unsigned diff = a.length() ^ b.length();
for(unsigned i = 0; i < a.length() && i < b.length(); i++)
diff |= (unsigned)a[i] ^ (unsigned)b[i];
bool equal = diff == 0;
Is using "slow equals" even the right way to validate pbkdf2 passwords? I am a bit confused on this.

I'm writing a program which hashes passwords with the pbkdf2 method using cryptopp.
You linked to the Crypto++ main page, not to your particular use of PBKDF. Here's some code just in case (it uses the IETF test vectors from RFC 6070):
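#include <cstring>
#include <iostream>
#include <string>
#include <cryptopp/filters.h>
#include <cryptopp/hex.h>
#include <cryptopp/pwdbased.h>
#include <cryptopp/sha.h>
using namespace CryptoPP;
using std::cout; using std::endl; using std::string;
int main(int argc, char* argv[])
{
    byte password[] = "password";
    size_t plen = strlen((const char*)password);
    byte salt[] = "salt";
    size_t slen = strlen((const char*)salt);
    int c = 1; // iteration count from the first RFC 6070 test vector
    byte derived[20];
    PKCS5_PBKDF2_HMAC<CryptoPP::SHA1> pbkdf2;
    pbkdf2.DeriveKey(derived, sizeof(derived), 0, password, plen, salt, slen, c);
    string result;
    HexEncoder encoder(new StringSink(result));
    encoder.Put(derived, sizeof(derived));
    encoder.MessageEnd(); // flush the encoder into the sink
    cout << "Derived: " << result << endl;
    return 0;
}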
I have tried to compare the output in "length-constant" time but it always fails and returns false.
Crypto++ has a constant time compare built in. Use VerifyBufsEqual from misc.h. The source is available in misc.cpp.
$ cd cryptopp
$ grep -R VerifyBufsEqual *
cryptlib.cpp: return VerifyBufsEqual(digest, digestIn, digestLength);
default.cpp: if (!VerifyBufsEqual(check, check+BLOCKSIZE, BLOCKSIZE))
fipstest.cpp: if (!VerifyBufsEqual(expectedModuleMac, actualMac, macSize))
fipstest.cpp: if (VerifyBufsEqual(expectedModuleMac, actualMac, macSize))
misc.cpp:bool VerifyBufsEqual(const byte *buf, const byte *mask, size_t count)
misc.h:CRYPTOPP_DLL bool CRYPTOPP_API VerifyBufsEqual(const byte *buf1, const byte *buf2, size_t count);
pssr.cpp: valid = VerifyBufsEqual(representative + representativeByteLength - u, hashIdentifier.first, hashIdentifier.second) && valid;
pubkey.cpp: return VerifyBufsEqual(representative, computedRepresentative, computedRepresentative.size());
secblock.h: return m_size == t.m_size && VerifyBufsEqual(m_ptr, t.m_ptr, m_size*sizeof(T));
What I'm not clear about: VerifyBufsEqual is predicated upon buffers of equal lengths. I'm not sure if it's OK to overlook the "not-equal length" case.
There's also a question on the Information Security Stack Exchange that may be relevant: Timing attacks on password hashes. But I'm not certain if/how it generalizes to arbitrary buffer compares.
The question piqued my interest in an answer to the general problem (the question has always been there): Constant time compares when array sizes are not equal?. That should tell us if we have the proper tools in VerifyBufsEqual (Crypto++), CRYPTO_memcmp (OpenSSL), etc.
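As a point of comparison, here is a minimal sketch of the "slow equals" idea from the question, written so that the length difference is folded into the result. The function name slowEquals is hypothetical and this is not Crypto++ code. It assumes the lengths themselves are not secret (the loop still runs for the shorter length), and nothing in the C++ standard guarantees that a compiler keeps the timing constant.

```cpp
#include <cstddef>
#include <string>

// Compares two byte strings without an early exit on the first mismatch.
// Assumption: the lengths are public, only the contents are sensitive.
bool slowEquals(const std::string& a, const std::string& b) {
    // A non-zero length difference already marks the inputs as unequal.
    std::size_t diff = a.size() ^ b.size();
    const std::size_t n = a.size() < b.size() ? a.size() : b.size();
    for (std::size_t i = 0; i < n; ++i) {
        // Go through unsigned char so the XOR works on raw byte values.
        diff |= static_cast<std::size_t>(static_cast<unsigned char>(a[i]) ^
                                         static_cast<unsigned char>(b[i]));
    }
    return diff == 0;
}
```

Both inputs need to be in the same representation (for example, both raw derived bytes or both hex strings) for a comparison like this to succeed.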


C++ - OpenSSL secp256k1 ECDSA - Can't verify using exact copy of hash, whether using string literals or different objects

I am trying to sign a hash, create a copy of that hash (specifically client/server related), then verify the signature using the copy of the hash. I don't understand how or why when I have two variables, hash and hashCopy, with the exact same content, that verification can only work on one of the two depending on which one was signed.
The below code shows what I am talking about.
int main()
EC_KEY* myecc = NULL;
EVP_PKEY* pkey_ = NULL;
myecc = EC_KEY_new_by_curve_name(NID_secp256k1);
pkey_ = EVP_PKEY_new();
EVP_PKEY_assign_EC_KEY(pkey_, myecc);
myecc = EVP_PKEY_get1_EC_KEY(pkey_);
const EC_GROUP* ecgrp = EC_KEY_get0_group(myecc);
std::string hash = "126420fb81d58bbb7d86c98e1818f4c9d44acf216471e196ed403a608954cf1d";
std::string hashCopy = hash;
std::vector<unsigned char> signature;
unsigned char pchSig[10000];
unsigned int nSize = 0;
ECDSA_sign(0, (unsigned char*)&hash, sizeof(hash), pchSig, &nSize, myecc);
memcpy(&signature[0], pchSig, nSize);
if (ECDSA_verify(0, (unsigned char*)&hash, sizeof(hash), &signature[0], signature.size(), myecc) == 1)
std::cout << "Original hash variable verify success\n\n";
std::cout << "Original hash variable verify fail\n\n";
if (ECDSA_verify(0, (unsigned char*)&hashCopy, sizeof(hashCopy), &signature[0], signature.size(), myecc) == 1)
std::cout << "Hash copy variable verify success\n\n";
std::cout << "Hash copy variable verify fail\n\n";
This outputs:
Original hash variable verify success
Hash copy variable verify fail
Why does verifying using hashCopy fail given that it's the exact same? This variable basically represents the variable on the client side, whereas the original hash variable represents the variable on the server side.
Also, if I just pass through a string literal "1264..." which is the exact same as hash that does not work either.
I also tried this using a copy of the signature variable, that made no difference as I would expect.
How do I get the verification of both hash and hashCopy to work without needing to sign both?

Searching for an exact string match in a (arbitrary large) stream - C++

I am building a simple multi-server for string matching. I handle multiple clients at the same time by using sockets and select. The only job that the server does is this: a client connects to a server and sends a needle (of size less than 10 GB) and a haystack (of arbitrary size) as a stream through a network socket. Needle and haystack are an arbitrary binary data.
The server needs to search the haystack for all occurrences of the needle (as an exact string match) and send the number of needle matches back to the client. The server needs to process clients on the fly and be able to handle any input in a reasonable time (that is, the search algorithm has to have linear time complexity).
To do this I obviously need to split the haystack into small parts (possibly smaller than the needle) in order to process them as they arrive through the network socket. That is, I need a search algorithm that can handle a string that is split into parts and search in it the same way strstr(...) does.
I could not find any standard C or C++ library function nor a Boost library object that could handle a string by parts. If I am not mistaken, the algorithms in strstr(), string.find() and Boost searching/knuth_morris_pratt.hpp can only handle the search when the whole haystack is in a contiguous block of memory. Or is there some trick that I am missing that I could use to search a string by parts? Do you know of any C/C++ library that is able to cope with such large needles and haystacks, or that is able to handle haystack streams or search a haystack by parts?
I did not find any useful library by googling, so I was forced to write my own variation of the Knuth-Morris-Pratt algorithm that is able to remember its own state (shown below). However, I do not consider it an optimal solution: a well-tuned string searching algorithm would surely perform better, and it would be one less thing to worry about when debugging later.
So my question is:
Is there a more elegant way to search a large haystack stream by parts, other than creating my own search algorithm? Is there any trick for using the standard C string library for this? Is there a C/C++ library that is specialized for this kind of task?
Here is (part of) the code of my modified KMP algorithm:
#include <cstdlib>
#include <cstring>
#include <cstdio>
class knuth_morris_pratt {
    const char* const needle;
    const size_t needle_len;
    const int* const lps; // the longest proper suffix table (skip table)
    // suffix_len is the offset of the longest haystack_part suffix matching
    // some prefix of the needle. suffix_len must be shorter than needle_len.
    // The offset is defined as the last matching character in the needle.
    size_t suffix_len;
    size_t match_count; // the number of needles found in the haystack
public:
    inline knuth_morris_pratt(const char* needle, size_t len) :
        needle(needle), needle_len(len),
        lps( build_lps_array() ), suffix_len(0),
        match_count(len == 0 ? 1 : 0) { }
    inline ~knuth_morris_pratt() { free((void*)lps); }
    void search_part(const char* haystack_part, size_t hp_len); // processes a given part of the haystack stream
    inline size_t get_match_count() { return match_count; }
private:
    const int* build_lps_array();
};
// Worst case complexity: linear space, linear time
// see article: KNUTH D.E., MORRIS (Jr) J.H., PRATT V.R., 1977, Fast pattern matching in strings
void knuth_morris_pratt::search_part(const char* haystack_part, size_t hp_len) {
    if(needle_len == 0) {
        match_count += hp_len;
        return;
    }
    const char* hs = haystack_part;
    size_t i = 0; // index for txt[]
    size_t j = suffix_len; // index for pat[], resumed from the previous part
    while (i < hp_len) {
        if (needle[j] == hs[i]) {
            j++;
            i++;
        }
        if (j == needle_len) {
            // a needle found
            match_count++;
            j = lps[j - 1];
        }
        else if (i < hp_len && needle[j] != hs[i]) {
            // Do not match lps[0..lps[j-1]] characters,
            // they will match anyway
            if (j != 0)
                j = lps[j - 1];
            else
                i = i + 1;
        }
    }
    suffix_len = j; // remember the partial match for the next part
}
const int* knuth_morris_pratt::build_lps_array() {
    // one int per needle character (malloc takes a size in bytes)
    int* const new_lps = (int*)malloc(needle_len * sizeof(int));
    // check_cond_fatal(new_lps != NULL, "Unable to allocate memory in knuth_morris_pratt(..)");
    // length of the previous longest prefix suffix
    size_t len = 0;
    new_lps[0] = 0; // lps[0] is always 0
    // the loop calculates lps[i] for i = 1 to M-1
    size_t i = 1;
    while (i < needle_len) {
        if (needle[i] == needle[len]) {
            len++;
            new_lps[i] = len;
            i++;
        }
        else // (pat[i] != pat[len])
        {
            // This is tricky. Consider the example.
            // AAACAAAA and i = 7. The idea is similar
            // to search step.
            if (len != 0) {
                len = new_lps[len - 1];
                // Also, note that we do not increment
                // i here
            }
            else // if (len == 0)
            {
                new_lps[i] = 0;
                i++;
            }
        }
    }
    return new_lps;
}
int main()
{
    const char* needle = "lorem";
    const char* p1 = "sit voluptatem accusantium doloremque laudantium qui dolo";
    const char* p2 = "rem ipsum quia dolor sit amet";
    const char* p3 = "dolorem eum fugiat quo voluptas nulla pariatur?";
    knuth_morris_pratt searcher(needle, strlen(needle));
    searcher.search_part(p1, strlen(p1));
    searcher.search_part(p2, strlen(p2));
    searcher.search_part(p3, strlen(p3));
    printf("%d \n", (int)searcher.get_match_count());
    return 0;
}
You can have a look at BNDM, which has the same complexity as KMP:
O(m) for preprocessing
O(n) for matching.
It is used in nrgrep, whose C sources can be found here.
The C source for the BNDM algorithm is here.
See here for more information.
If I have understood your problem correctly, you want to check whether a large std::string received part by part contains a substring.
If that is the case, I think you can store, for each iteration, the overlapping section between two contiguous received packets. Then you just have to check at each iteration whether either the overlap or the packet contains the desired pattern.
In the example below, I consider the following contains() function to search a pattern in a std::string:
bool contains(const std::string & str, const std::string & pattern)
{
    bool found(false);
    // <= so that a string equal to the pattern also counts as a match
    if(!pattern.empty() && (pattern.length() <= str.length()))
    {
        for(size_t i = 0; !found && (i <= str.length()-pattern.length()); ++i)
        {
            if((str[i] == pattern[0]) && (str.substr(i, pattern.length()) == pattern))
                found = true;
        }
    }
    return found;
}
std::string pattern("something"); // The pattern we want to find
std::string end_of_previous_packet(""); // The first part of overlapping section
std::string beginning_of_current_packet(""); // The second part of overlapping section
std::string overlap; // The string to store the overlap at each iteration
bool found(false);
while(!found && !all_data_received()) // stop condition
{
    // Get the current packet
    std::string packet = receive_part();
    // Set the beginning of the current packet
    beginning_of_current_packet = packet.substr(0, pattern.length());
    // Build the overlap
    overlap = end_of_previous_packet + beginning_of_current_packet;
    // If the overlap or the packet contains the pattern, we found a match
    if(contains(overlap, pattern) || contains(packet, pattern))
        found = true;
    // Set the end of previous packet for the next iteration. The old tail is
    // carried along so that packets shorter than the pattern are handled
    // without substr() throwing std::out_of_range.
    std::string tail = end_of_previous_packet + packet;
    if(tail.length() > pattern.length())
        tail = tail.substr(tail.length() - pattern.length());
    end_of_previous_packet = tail;
}
Of course, in this example I made the assumption that the method receive_part() already exists. Same thing for the all_data_received() function. It is just an example to illustrate the idea.
I hope it will help you to find a solution.
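For counting every occurrence rather than stopping at the first one, the stateful KMP idea from the question can be sketched more compactly with standard containers. This is only an illustration under stated assumptions: the class name StreamingKmp and its members are hypothetical, the needle is assumed to fit in memory, and an empty needle is simply treated as matching nothing.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Counts (possibly overlapping) occurrences of a needle in a haystack that
// arrives in arbitrary chunks. The only state kept between chunks is how many
// needle characters are currently matched.
class StreamingKmp {
public:
    explicit StreamingKmp(std::string needle)
        : needle_(std::move(needle)), lps_(needle_.size(), 0) {
        // Standard KMP failure table: lps_[i] is the length of the longest
        // proper prefix of needle_[0..i] that is also a suffix of it.
        std::size_t len = 0;
        std::size_t i = 1;
        while (i < needle_.size()) {
            if (needle_[i] == needle_[len]) {
                ++len;
                lps_[i] = len;
                ++i;
            } else if (len != 0) {
                len = lps_[len - 1];
            } else {
                lps_[i] = 0;
                ++i;
            }
        }
    }

    // Feed the next chunk of the haystack; chunks may be of any size.
    void feed(const char* data, std::size_t size) {
        if (needle_.empty()) return;
        for (std::size_t i = 0; i < size; ++i) {
            while (matched_ > 0 && needle_[matched_] != data[i])
                matched_ = lps_[matched_ - 1];
            if (needle_[matched_] == data[i]) ++matched_;
            if (matched_ == needle_.size()) {
                ++count_;
                matched_ = lps_[matched_ - 1];  // allow overlapping matches
            }
        }
    }

    std::size_t count() const { return count_; }

private:
    std::string needle_;
    std::vector<std::size_t> lps_;
    std::size_t matched_ = 0;
    std::size_t count_ = 0;
};
```

Each haystack byte is inspected once, in arrival order, so a chunk can be discarded as soon as feed() returns, and a needle that straddles a chunk boundary is still counted.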

Case insensitive sorting of an array of strings

Basically, I have to use selection sort to sort a string[]. I have done this part but this is what I am having difficulty with.
The sort, however, should be case-insensitive, so that "antenna" would come before "Jupiter". ASCII sorts from uppercase to lowercase, so would there not be a way to just swap the order of the sorted string? Or is there a simpler solution?
void stringSort(string array[], int size) {
    int startScan, minIndex;
    string minValue;
    for(startScan = 0 ; startScan < (size - 1); startScan++) {
        minIndex = startScan;
        minValue = array[startScan];
        for (int index = startScan + 1; index < size; index++) {
            if (array[index] < minValue) {
                minValue = array[index];
                minIndex = index;
            }
        }
        array[minIndex] = array[startScan];
        array[startScan] = minValue;
    }
}
C++ provides you with sort which takes a comparison function. In your case with a vector<string> you'll be comparing two strings. The comparison function should return true if the first argument is smaller.
For our comparison function we'll want to find the first mismatched character between the strings after tolower has been applied. To do this we can use mismatch which takes a comparator between two characters returning true as long as they are equal:
const auto result = mismatch(lhs.cbegin(), lhs.cend(), rhs.cbegin(), rhs.cend(), [](const unsigned char lhs, const unsigned char rhs){return tolower(lhs) == tolower(rhs);});
To decide if the lhs is smaller than the rhs fed to mismatch we need to test 3 things:
Were the strings of unequal length
Was string lhs shorter
Or was the first mismatched char from lhs smaller than the first mismatched char from rhs
This evaluation can be performed by:
result.second != rhs.cend() && (result.first == lhs.cend() || tolower(*result.first) < tolower(*result.second));
Ultimately, we'll want to wrap this up in a lambda and plug it back into sort as our comparator:
sort(foo.begin(), foo.end(), [](const string& lhs, const string& rhs){
    const auto result = mismatch(lhs.cbegin(), lhs.cend(), rhs.cbegin(), rhs.cend(), [](const unsigned char lhs, const unsigned char rhs){return tolower(lhs) == tolower(rhs);});
    return result.second != rhs.cend() && (result.first == lhs.cend() || tolower(*result.first) < tolower(*result.second));
});
This will correctly sort vector<string> foo. You can see a live example here:
Just saw your question update. You can use sort with string array[] as well. You'll just need to call it like this: sort(array, std::next(array, size), ...
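#include <algorithm>
#include <cctype>
#include <vector>
#include <string>
using namespace std;
void CaseInsensitiveSort(vector<string>& strs)
{
    sort(
        begin(strs), end(strs),
        [](const string& str1, const string& str2){
            return lexicographical_compare(
                begin(str1), end(str1),
                begin(str2), end(str2),
                [](const char& char1, const char& char2) {
                    // cast first: tolower is undefined for negative char values
                    return tolower(static_cast<unsigned char>(char1)) < tolower(static_cast<unsigned char>(char2));
                }
            );
        }
    );
}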
I use this lambda function to sort a vector of strings:
std::sort(entries.begin(), entries.end(), [](const std::string& a, const std::string& b) -> bool {
    for (size_t c = 0; c < a.size() and c < b.size(); c++) {
        if (std::tolower(a[c]) != std::tolower(b[c]))
            return (std::tolower(a[c]) < std::tolower(b[c]));
    }
    return a.size() < b.size();
});
Instead of the < operator, use a case-insensitive string comparison function.
C89/C99 provide strcoll (string collate), which does a locale-aware string comparison. It's available in C++ as std::strcoll. In some (most?) locales, like en_CA.UTF-8, A and a (and all accented forms of either) are in the same equivalence class. I think strcoll only compares within an equivalence class as a tiebreak if the whole string is otherwise equal, which gives a very similar sort order to a case-insensitive compare. Collation (at least in English locales on GNU/Linux) ignores some characters (like [). So ls /usr/share | sort gives output like
I pipe through sort because ls does its own sorting, which isn't quite the same as sort's locale-based sorting.
If you want to sort some user-input arbitrary strings into an order that the user will see directly, locale-aware string comparison is usually what you want. Strings that differ only in case or accents won't compare equal, so this won't work if you were using a stable sort and depending on case-differing strings to compare equal, but otherwise you get nice results. Depending on the use-case, nicer than plain case-insensitive comparison.
FreeBSD's strcoll was and maybe still is case sensitive for locales other than POSIX (ASCII). That forum post suggests that on most other systems it is not case sensitive.
MSVC provides a _stricoll for case-insensitive collation, implying that its normal strcoll is case sensitive. However, this might just mean that the fallback to comparing within an equivalence class doesn't happen. Maybe someone can test the following example with MSVC.
// strcoll.c: show that these strings sort in a different order, depending on locale
#include <stdio.h>
#include <string.h>
#include <locale.h>
int main()
{
    // TODO: try some strings containing characters like '[' that strcoll ignores completely.
    const char * s[] = { "FooBar - abc", "Foobar - bcd", "FooBar - cde" };
#ifdef USE_LOCALE
    setlocale(LC_ALL, ""); // empty string means look at env vars
#endif
    strcoll(s[0], s[1]);
    strcoll(s[0], s[2]);
    strcoll(s[1], s[2]);
    return 0;
}
output of gcc -DUSE_LOCALE -Og strcoll.c && ltrace ./a.out (or run LANG=C ltrace a.out):
__libc_start_main(0x400586, 1, ...
setlocale(LC_ALL, "") = "en_CA.UTF-8" # my env contains LANG=en_CA.UTF-8
strcoll("FooBar - abc", "Foobar - bcd") = -1
strcoll("FooBar - abc", "FooBar - cde") = -2
strcoll("Foobar - bcd", "FooBar - cde") = -1
# the three strings are in order
+++ exited (status 0) +++
with gcc -Og -UUSE_LOCALE strcoll.c && ltrace ./a.out:
__libc_start_main(0x400536, ...
# no setlocale, so current locale is C
strcoll("FooBar - abc", "Foobar - bcd") = -32
strcoll("FooBar - abc", "FooBar - cde") = -2
strcoll("Foobar - bcd", "FooBar - cde") = 32 # s[1] should sort after s[2], so it's out of order
+++ exited (status 0) +++
POSIX.1-2001 provides strcasecmp. The POSIX spec says the results are "unspecified" for locales other than plain-ASCII, though, so I'm not sure whether common implementations handle utf-8 correctly or not.
See this post for portability issues with strcasecmp, e.g. to Windows. See other answers on that question for other C++ ways of doing case-insensitive string compares.
Once you have a case-insensitive comparison function, you can use it with other sort algorithms, like C standard lib qsort, or c++ std::sort, instead of writing your own O(n^2) selection-sort.
As b.buchhold's answer points out, doing a case-insensitive comparison on the fly might be slower than converting everything to lowercase once and sorting an array of indices, because the lowercase version of each string is needed many times. std::strxfrm will transform a string so that strcmp on the result will give the same result as strcoll on the original string.
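To illustrate the std::strxfrm point, here is a small sketch that builds a reusable collation key for one string. The helper name collationKey is hypothetical. The key depends on whatever locale is current when the function is called, so keys created under different locales should not be compared with each other.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Returns a key such that comparing two keys byte-wise orders the original
// strings the same way std::strcoll would in the current locale.
std::string collationKey(const std::string& s) {
    // With a zero-sized buffer, strxfrm only reports the required length
    // (not counting the terminating null).
    const std::size_t n = std::strxfrm(nullptr, s.c_str(), 0);
    std::string key(n + 1, '\0');
    std::strxfrm(&key[0], s.c_str(), key.size());
    key.resize(n);  // drop the terminating null written by strxfrm
    return key;
}
```

A sort could compute each key once up front and then compare keys with plain operator<, instead of calling strcoll for every comparison.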
You could call tolower on every character you compare. This is probably the easiest, yet not a great solution, because:
You look at every char multiple times, so you'd call the function more often than necessary
You need extra care to handle wide characters w.r.t. their encoding (UTF-8 etc.)
You could also replace the comparator with your own function. I.e. there will be some place where you compare something like stringone[i] < stringtwo[j] or charA < charB. Change it to my_less_than(stringone[i], stringtwo[j]) and implement the exact ordering you want.
Another way would be to transform every string to lowercase once and create an array of pairs. Then you base your comparisons on the lowercase value only, but you swap whole pairs so that your final strings will be in the right order as well.
Finally, you can create an array with lowercase versions and sort that one. Whenever you swap two elements in it, you also swap in the original array.
Note that all these proposals would still need proper handling of wide characters (if you need that at all).
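The "array of pairs" variant could look roughly like the sketch below. The function names are hypothetical, std::sort stands in for the hand-written selection sort from the question, and only single-byte characters are handled, as the caveat about wide characters suggests.

```cpp
#include <algorithm>
#include <cctype>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Lowercases a copy of s, going through unsigned char to keep tolower defined.
std::string toLowerCopy(std::string s) {
    std::transform(s.begin(), s.end(), s.begin(), [](unsigned char c) {
        return static_cast<char>(std::tolower(c));
    });
    return s;
}

// Sorts strs case-insensitively by pairing each string with a lowercase key
// that is computed exactly once.
void sortCaseInsensitive(std::vector<std::string>& strs) {
    typedef std::pair<std::string, std::string> Keyed;  // (lowercase key, original)
    std::vector<Keyed> keyed;
    keyed.reserve(strs.size());
    for (std::size_t i = 0; i < strs.size(); ++i) {
        std::string key = toLowerCopy(strs[i]);
        keyed.push_back(Keyed(std::move(key), std::move(strs[i])));
    }
    // Compare the keys only; whole pairs move, so the originals follow along.
    std::sort(keyed.begin(), keyed.end(),
              [](const Keyed& a, const Keyed& b) { return a.first < b.first; });
    for (std::size_t i = 0; i < strs.size(); ++i) {
        strs[i] = std::move(keyed[i].second);
    }
}
```

The same pairing works with a selection sort: compare on .first and swap the whole pair.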
This solution is much simpler to understand than Jonathan Mee's and pretty inefficient, but for educational purpose could be fine:
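std::string lowercase( std::string s )
{
    // take unsigned char so tolower is never called with a negative value
    std::transform( s.begin(), s.end(), s.begin(),
                    []( unsigned char c ) { return (char)::tolower( c ); } );
    return s;
}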
std::sort( array, array + length,
[]( const std::string &s1, const std::string &s2 ) {
return lowercase( s1 ) < lowercase( s2 );
} );
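If you have to use your own sort function, you can use the same approach:
for (startScan = 0; startScan < (size - 1); startScan++) {
    minIndex = startScan;
    minValue = lowercase( array[startScan] );
    for (int index = startScan + 1; index < size; index++) {
        const std::string &tstr = lowercase( array[index] );
        if (tstr < minValue) {
            minValue = tstr;
            minIndex = index;
        }
    }
    // minValue now holds a lowercase copy, so swap the original elements
    std::swap( array[minIndex], array[startScan] );
}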

Character pointers messed up in simple Boyer-Moore implementation

I am currently experimenting with a very simple Boyer-Moore variant.
In general my implementation works, but if I try to utilize it in a loop the character pointer containing the haystack gets messed up. And I mean that characters in it are altered, or mixed.
The result is consistent, i.e. running the same test multiple times yields the same screw up.
This is the looping code:
string src("This haystack contains a needle! needless to say that only 2 matches need to be found!");
string pat("needle");
const char* res = src.c_str();
while((res = boyerMoore(res, pat)))
This is my implementation of the string search algorithm (the above code calls a convenience wrapper which pulls the character pointer and length of the string):
unsigned char*
boyerMoore(const unsigned char* src, size_t srcLgth, const unsigned char* pat, size_t patLgth)
if(srcLgth < patLgth || !src || !pat)
return nullptr;
size_t skip[UCHAR_MAX]; //this is the skip table
for(int i = 0; i < UCHAR_MAX; ++i)
skip[i] = patLgth; //initialize it with default value
for(size_t i = 0; i < patLgth; ++i)
skip[(int)pat[i]] = patLgth - i - 1; //set skip value of chars in pattern
std::cout<<src<<"\n"; //just to see what's going on here!
size_t srcI = patLgth - 1; //our first character to check
while(srcI < srcLgth)
size_t j = 0; //char match ct
while(j < patLgth)
if(src[srcI - j] == pat[patLgth - j - 1])
//since the number of characters to skip may be negative, I just increment in that case
size_t t = skip[(int)src[srcI - j]];
if(t > j)
srcI = srcI + t - j;
if(j == patLgth)
return (unsigned char*)&src[srcI + 1 - j];
return nullptr;
The loop produced this output (i.e. these are the haystacks the algorithm received):
This haystack contains a needle! needless to say that only 2 matches need to be found!
eedle! needless to say that only 2 matches need to be found!
eedless to say that eed 2 meed to beed to be found!
As you can see the input is completely messed up after the second run. What am I missing? I thought the contents could not be modified, since I'm passing const pointers.
Is the way of setting the pointer in the loop wrong, or is my string search screwing up?
Btw: This is the complete code, except for includes and the main function around the looping code.
The missing nullptr of the first return was due to a copy/paste error, in the source it is actually there.
For clarification, this is my wrapper function:
inline char* boyerMoore(const string &src, const string &pat)
return (const char*) boyerMoore((const unsigned char*) src.c_str(), src.size(),
(const unsigned char*) pat.c_str(), pat.size());
In your boyerMoore() function, the first return isn't returning a value (you have just return; rather than return nullptr;). GCC doesn't always warn about missing return values, and not returning anything from a non-void function is undefined behavior. That means that when you store the return value in res and call the function again, there's no telling what will print out. You can see a related discussion here.
Also, you have omitted your convenience function that calculates the length of the strings that you are passing in. I would recommend double checking that logic to make sure the sizes are correct - I'm assuming you are using strlen or similar.
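One way to keep the sizes explicit, as suggested above, is to carry the pointer and the remaining length through the loop together. The following is a sketch only, using a simplified Horspool-style skip table rather than the code from the question. The names horspool and countMatches are hypothetical, and the table has UCHAR_MAX + 1 entries so that every byte value has a slot.

```cpp
#include <climits>
#include <cstddef>
#include <string>

// Returns a pointer to the first match in [src, src + srcLen), or nullptr.
const char* horspool(const char* src, std::size_t srcLen,
                     const char* pat, std::size_t patLen) {
    if (!src || !pat || patLen == 0 || srcLen < patLen) return nullptr;
    std::size_t skip[UCHAR_MAX + 1];
    for (std::size_t i = 0; i <= UCHAR_MAX; ++i) skip[i] = patLen;
    // The last pattern character is left out so every shift is at least 1.
    for (std::size_t i = 0; i + 1 < patLen; ++i)
        skip[static_cast<unsigned char>(pat[i])] = patLen - 1 - i;
    std::size_t pos = 0;
    while (pos + patLen <= srcLen) {
        std::size_t j = patLen;
        while (j > 0 && src[pos + j - 1] == pat[j - 1]) --j;  // compare right to left
        if (j == 0) return src + pos;
        pos += skip[static_cast<unsigned char>(src[pos + patLen - 1])];
    }
    return nullptr;
}

// Counts matches by restarting one character past each hit. The remaining
// length is recomputed from the original buffer on every call.
std::size_t countMatches(const std::string& src, const std::string& pat) {
    std::size_t count = 0;
    const char* cur = src.data();
    const char* const end = src.data() + src.size();
    while ((cur = horspool(cur, static_cast<std::size_t>(end - cur),
                           pat.data(), pat.size())) != nullptr) {
        ++count;
        ++cur;
    }
    return count;
}
```

Every pointer handed back here points into the caller's original buffer, and the length passed in always describes what is left of that same buffer.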

Fastest way to do a case-insensitive substring search in C/C++?

The question below was asked in 2008 about some code from 2003. As the OP's update shows, this entire post has been obsoleted by vintage 2008 algorithms and persists here only as a historical curiosity.
I need to do a fast case-insensitive substring search in C/C++. My requirements are as follows:
Should behave like strstr() (i.e. return a pointer to the match point).
Must be case-insensitive (doh).
Must support the current locale.
Must be available on Windows (MSVC++ 8.0) or easily portable to Windows (i.e. from an open source library).
Here is the current implementation I am using (taken from the GNU C Library):
/* Return the offset of one string within another.
Copyright (C) 1994,1996,1997,1998,1999,2000 Free Software Foundation, Inc.
This file is part of the GNU C Library.
The GNU C Library is free software; you can redistribute it and/or
modify it under the terms of the GNU Lesser General Public
License as published by the Free Software Foundation; either
version 2.1 of the License, or (at your option) any later version.
The GNU C Library is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
Lesser General Public License for more details.
You should have received a copy of the GNU Lesser General Public
License along with the GNU C Library; if not, write to the Free
Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA
02111-1307 USA. */
* My personal strstr() implementation that beats most other algorithms.
* Until someone tells me otherwise, I assume that this is the
* fastest implementation of strstr() in C.
* I deliberately chose not to comment it. You should have at least
* as much fun trying to understand it, as I had to write it :-).
* Stephen R. van den Berg, */
* Modified to use table lookup instead of tolower(), since tolower() isn't
* worth s*** on Windows.
* -- Anders Sandvig (
# include <config.h>
#include <ctype.h>
#include <string.h>
typedef unsigned chartype;
char char_table[256];
void init_stristr(void)
{
    int i;
    char string[2];
    string[1] = '\0';
    for (i = 0; i < 256; i++)
    {
        string[0] = i;
        _strlwr(string);
        char_table[i] = string[0];
    }
}
#define my_tolower(a) ((chartype) char_table[a])
char *
my_stristr (phaystack, pneedle)
const char *phaystack;
const char *pneedle;
register const unsigned char *haystack, *needle;
register chartype b, c;
haystack = (const unsigned char *) phaystack;
needle = (const unsigned char *) pneedle;
b = my_tolower (*needle);
if (b != '\0')
haystack--; /* possible ANSI violation */
c = *++haystack;
if (c == '\0')
goto ret0;
while (my_tolower (c) != (int) b);
c = my_tolower (*++needle);
if (c == '\0')
goto foundneedle;
goto jin;
for (;;)
register chartype a;
register const unsigned char *rhaystack, *rneedle;
a = *++haystack;
if (a == '\0')
goto ret0;
if (my_tolower (a) == (int) b)
a = *++haystack;
if (a == '\0')
goto ret0;
while (my_tolower (a) != (int) b);
a = *++haystack;
if (a == '\0')
goto ret0;
if (my_tolower (a) != (int) c)
goto shloop;
rhaystack = haystack-- + 1;
rneedle = needle;
a = my_tolower (*rneedle);
if (my_tolower (*rhaystack) == (int) a)
if (a == '\0')
goto foundneedle;
a = my_tolower (*++needle);
if (my_tolower (*rhaystack) != (int) a)
if (a == '\0')
goto foundneedle;
a = my_tolower (*++needle);
while (my_tolower (*rhaystack) == (int) a);
needle = rneedle; /* took the register-poor approach */
if (a == '\0')
return (char*) haystack;
return 0;
Can you make this code faster, or do you know of a better implementation?
Note: I noticed that the GNU C Library now has a new implementation of strstr(), but I am not sure how easily it can be modified to be case-insensitive, or if it is in fact faster than the old one (in my case). I also noticed that the old implementation is still used for wide character strings, so if anyone knows why, please share.
Just to make things clear—in case it wasn't already—I didn't write this function, it's a part of the GNU C Library. I only modified it to be case-insensitive.
Also, thanks for the tip about strcasestr() and checking out other implementations from other sources (like OpenBSD, FreeBSD, etc.). It seems to be the way to go. The code above is from 2003, which is why I posted it here in hope for a better version being available, which apparently it is. :)
The code you posted is about half as fast as strcasestr.
$ gcc -Wall -o my_stristr my_stristr.c
$ gcc -Wall -o strcasestr strcasestr.c
$ ./bench ./my_stristr > my_stristr.result ; ./bench ./strcasestr > strcasestr.result;
$ cat my_stristr.result
run 1... time = 6.32
run 2... time = 6.31
run 3... time = 6.31
run 4... time = 6.31
run 5... time = 6.32
run 6... time = 6.31
run 7... time = 6.31
run 8... time = 6.31
run 9... time = 6.31
run 10... time = 6.31
average user time over 10 runs = 6.3120
$ cat strcasestr.result
run 1... time = 3.82
run 2... time = 3.82
run 3... time = 3.82
run 4... time = 3.82
run 5... time = 3.82
run 6... time = 3.82
run 7... time = 3.82
run 8... time = 3.82
run 9... time = 3.82
run 10... time = 3.82
average user time over 10 runs = 3.8200
The main function was:
int main(void)
char * needle="hello";
char haystack[1024];
int i;
memcpy(haystack+i,needle, strlen(needle)+1);
/*printf("%s\n%d\n", haystack, haystack[strlen(haystack)]);*/
for (i=0;i<1000000;++i)
/*my_stristr(haystack, needle);*/
return 0;
It was suitably modified to test both implementations. I notice as I am typing this up I left in the init_stristr call, but it shouldn't change things too much. bench is just a simple shell script:
function bc_calc()
echo $(echo "scale=4;$1" | bc)
time="/usr/bin/time -p"
for a in $(jot $runs 1 $runs)
echo -n "run $a... "
t=$($time $prog 2>&1| grep user | awk '{print $2}')
echo "time = $t"
accum=$(bc_calc "$accum+$t")
echo -n "average user time over $runs runs = "
echo $(bc_calc "$accum/$runs")
You can use StrStrI function which finds the first occurrence of a substring within a string. The comparison is not case-sensitive.
Don't forget to include its header - Shlwapi.h.
Check this out:
Use the Boost string algorithms. They are available, cross-platform, and header-only (no library to link in). Not to mention that you should be using Boost anyway.
#include <boost/algorithm/string/find.hpp>
const char* istrstr( const char* haystack, const char* needle )
{
    using namespace boost;
    // the haystack is const, so the resulting range iterates over const char*
    iterator_range<const char*> result = ifind_first( haystack, needle );
    if( result ) return result.begin();
    return NULL;
}
For platform independent use:
const wchar_t *szk_wcsstri(const wchar_t *s1, const wchar_t *s2)
if (s1 == NULL || s2 == NULL) return NULL;
const wchar_t *cpws1 = s1, *cpws1_, *cpws2;
wchar_t ch1, ch2; // wide type, so the result of towlower() is not truncated
bool bSame;
while (*cpws1 != L'\0')
bSame = true;
if (*cpws1 != *s2)
ch1 = towlower(*cpws1);
ch2 = towlower(*s2);
if (ch1 == ch2)
bSame = true;
if (true == bSame)
cpws1_ = cpws1;
cpws2 = s2;
while (*cpws1_ != L'\0')
ch1 = towlower(*cpws1_);
ch2 = towlower(*cpws2);
if (ch1 != ch2)
if (*cpws2 == L'\0')
return cpws1_-(cpws2 - s2 - 0x01);
return NULL;
Why do you use _strlwr(string); in init_stristr()? It's not a standard function. Presumably it's for locale support, but as it's not standard, I'd just use:
char_table[i] = tolower(i);
I'd advise you to use one of the common strcasestr implementations that already exist, for example those of glib, glibc, OpenBSD, FreeBSD, etc. You can then make some performance measurements and compare the different implementations.
Assuming both input strings are already lowercase.
int StringInStringFindFirst(const char* p_cText, const char* p_cSearchText)
int iTextSize = strlen(p_cText);
int iSearchTextSize = strlen(p_cSearchText);
char* p_cFound = NULL;
if(iTextSize >= iSearchTextSize)
int iCounter = 0;
while((iCounter + iSearchTextSize) <= iTextSize)
if(memcmp( (p_cText + iCounter), p_cSearchText, iSearchTextSize) == 0)
return iCounter;
iCounter ++;
return -1;
You could also try using masks. If, for example, most of the strings you are going to compare only contain chars from a to z, maybe it's worth doing something like this.
long GetStringMask(const char* p_cText)
long lMask=0;
while(*p_cText != '\0')
if (*p_cText>='a' && *p_cText<='z')
lMask = lMask | (1 << (*p_cText - 'a') );
else if(*p_cText != ' ')
lMask = 0;
p_cText ++;
return lMask;
int main(int argc, char* argv[])
char* p_cText = "this is a test";
char* p_cSearchText = "test";
long lTextMask = GetStringMask(p_cText);
long lSearchMask = GetStringMask(p_cSearchText);
int iFoundAt = -1;
// If Both masks are Valid
if(lTextMask != 0 && lSearchMask != 0)
if((lTextMask & lSearchMask) == lSearchMask)
iFoundAt = StringInStringFindFirst(p_cText, p_cSearchText);
iFoundAt = StringInStringFindFirst(p_cText, p_cSearchText);
return 0;
This will not consider the locale, but if you change IS_ALPHA and TO_UPPER you can make it do so.
#define IS_ALPHA(c) (((c) >= 'A' && (c) <= 'Z') || ((c) >= 'a' && (c) <= 'z'))
#define TO_UPPER(c) ((c) & 0xDF)
char * __cdecl strstri (const char * str1, const char * str2){
char *cp = (char *) str1;
char *s1, *s2;
if ( !*str2 )
return((char *)str1);
while (*cp){
s1 = cp;
s2 = (char *) str2;
while ( *s1 && *s2 && (IS_ALPHA(*s1) && IS_ALPHA(*s2))?!(TO_UPPER(*s1) - TO_UPPER(*s2)):!(*s1-*s2))
++s1, ++s2;
if (!*s2)
If you want to shed CPU cycles, you might consider this - let's assume that we're dealing with ASCII and not Unicode.
Make a static table with 256 entries. Each entry in the table is 256 bits.
To test whether or not two characters are equal, you do something like this:
if (BitLookup(table[char1], char2)) { /* match */ }
To build the table, you set a bit everywhere in table[char1] where you consider it a match for char2. So in building the table you would set the bits at the index for 'a' and 'A' in the 'a'th entry (and the 'A'th entry).
Now this is going to be slowish to do the bit lookup (a bit lookup will most likely be a shift, a mask and an add), so you could instead use a table of bytes, spending 8 bits to represent 1 bit. This will take 64K (256 × 256 bytes) - so hooray - you've hit a time/space trade-off! We might want to make the table more flexible, so let's say we do this instead - the table will define congruences instead.
Two characters are considered congruent if and only if there is a function that defines them as equivalent. So 'A' and 'a' are congruent for case insensitivity. 'A', 'À', 'Á' and 'Â' are congruent for diacritical insensitivity.
So you define bitfields that correspond to your congruencies
#define kCongruentCase (1 << 0)
#define kCongruentDiacritical (1 << 1)
#define kCongruentVowel (1 << 2)
#define kCongruentConsonant (1 << 3)
Then your test is something like this:
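inline bool CharsAreCongruent(char c1, char c2, unsigned char congruency)
{
    // index through unsigned char so that negative char values cannot underflow the table
    return (_congruencyTable[(unsigned char)c1][(unsigned char)c2] & congruency) != 0;
}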
#define CaseInsensitiveCharEqual(c1, c2) CharsAreCongruent(c1, c2, kCongruentCase)
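To make the table idea concrete, here is a minimal sketch that fills in only the case congruence. All names are hypothetical stand-ins for the ones used above, and it relies on std::tolower in the current C locale to decide which byte values count as the same letter.

```cpp
#include <cctype>
#include <cstdint>

constexpr std::uint8_t kCaseBit = 1u << 0;  // plays the role of kCongruentCase

// One byte per character pair: 256 * 256 bytes.
static std::uint8_t congruencyTable[256][256];

// Sets the case bit wherever two byte values lowercase to the same value.
void buildCongruencyTable() {
    for (int a = 0; a < 256; ++a) {
        for (int b = 0; b < 256; ++b) {
            congruencyTable[a][b] =
                (std::tolower(a) == std::tolower(b)) ? kCaseBit : 0;
        }
    }
}

// Table lookup; unsigned char keeps the indices in the range 0..255.
inline bool charsAreCongruent(char c1, char c2, std::uint8_t congruency) {
    return (congruencyTable[static_cast<unsigned char>(c1)]
                           [static_cast<unsigned char>(c2)] & congruency) != 0;
}
```

Further bits, such as one for diacritics, would be set by additional passes over the same table.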
This kind of bit fiddling with ginormous tables is the heart of ctype, by the by.
If you can control the needle string so that it is always in lower case, then you can write a modified version of stristr() to avoid the lookups for that, and thus speed up the code. It isn't as general, but it can be faster - slightly faster. Similar comments apply to the haystack, but you are more likely to be reading the haystack from sources outside your control, so you cannot be certain that the data meets the requirement.
Whether the gain in performance is worth it is another question altogether. For 99% of applications, the answer is "No, it is not worth it". Your application might be one of the tiny minority where it matters. More likely, it is not.
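For illustration, the pre-lowercased needle idea might look like the sketch below. The function name is hypothetical, the caller is assumed to guarantee that the needle is already lowercase, and this is a plain quadratic scan rather than the tuned glibc routine, so it shows the interface, not the performance.

```cpp
#include <cctype>

// Case-insensitive strstr() variant that only lowercases haystack characters.
// Assumption: lowerNeedle already consists of lowercase characters.
const char* stristrLowerNeedle(const char* haystack, const char* lowerNeedle) {
    if (!*lowerNeedle) return haystack;  // same convention as strstr()
    for (; *haystack; ++haystack) {
        const char* h = haystack;
        const char* n = lowerNeedle;
        // Stops at the end of the needle or on the first mismatch; the
        // haystack terminator never equals a non-null needle character.
        while (*n && std::tolower(static_cast<unsigned char>(*h)) ==
                         static_cast<unsigned char>(*n)) {
            ++h;
            ++n;
        }
        if (!*n) return haystack;  // the whole needle matched here
    }
    return nullptr;
}
```

For example, stristrLowerNeedle("Hello World", "world") returns a pointer to the "World" part of the haystack.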