2016-05-31: The /* Programming Comments */ documents have moved, and will no longer be updated or maintained at this location! Please update your bookmarks: http://www.ccoderun.ca/programming/.

Summary

The old regex.h method of using regular expressions with regcomp() and regexec() is not a great solution.

I'd much prefer to use std::regex from C++11, or the Boost equivalent boost::regex.

C++11 and Boost

Of all the C++11 features, regex support was possibly the last to come to GCC. It didn't arrive until v4.9, in April 2014.

For this reason, and since I also routinely need regular expressions on Windows with an older version of Microsoft Visual Studio as well as with GCC on Linux, I've been using boost::regex. Apart from the header and the namespace, there should be little or no difference between this example code with boost::regex and the same code with C++11's std::regex.
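To give a feel for how close the two APIs are, here is a minimal sketch of a search helper written against C++11's std::regex instead of Boost. The function name std_regex_find is hypothetical and exists only for this illustration. One difference to keep in mind: std::regex defaults to the ECMAScript grammar, while boost::regex defaults to Perl syntax. Simple patterns like the ones in this post behave the same in both, but I wouldn't assume that holds for every pattern.

```cpp
#include <regex>
#include <string>
#include <vector>

typedef std::vector<std::string> VStr;

// Hypothetical helper for this sketch (not part of any library).
// Searches str for pattern; on success, groups receives the entire match
// followed by each capture group.
// Throws std::regex_error if the pattern cannot be compiled.
bool std_regex_find( const std::string &str, const std::string &pattern, VStr &groups )
{
    groups.clear();

    const std::regex exp( pattern ); // std::regex defaults to the ECMAScript grammar
    std::smatch what;                // string matches

    const bool result = std::regex_search( str, what, exp );

    // remember the groupings (if any)
    for ( std::size_t idx = 0; result && idx < what.size(); idx ++ )
    {
        groups.push_back( what[idx].str() );
    }

    return result;
}
```

Called with the string "abc123xyz" and the pattern "([0-9]+)", this should return true and leave two entries in groups: the entire match and the single capture group.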

The following example functions do much the same job as the regex.h code I pasted nearly 5 years ago:

#include <string>
#include <vector>
#include <boost/regex.hpp>

typedef std::vector<std::string> VStr;

// forward declaration, since the 2-parameter overload below calls this one
bool my_regex_find( const std::string &str, const std::string &pattern, VStr &groups );

/** Find out if a pattern exists in a string.
 * @return @p true if the pattern matches
 * @return @p false if the pattern doesn't match
 * @throw regex_error -> std::runtime_error -> std::exception
 */
bool my_regex_find( const std::string &str, const std::string &pattern )
{
    VStr groups;
    return my_regex_find( str, pattern, groups );
}

/** Find a pattern in a string, and remember the groupings (if any).
 * @return @p true if the pattern matches
 * @return @p false if the pattern doesn't match
 * @throw regex_error -> std::runtime_error -> std::exception
 */
bool my_regex_find( const std::string &str, const std::string &pattern, VStr &groups )
{
    groups.clear();

    // default boost regex type is "PRE" (Perl Regular Expression)
    // but this can be changed with a 2nd parm
    boost::regex exp( pattern );
    boost::smatch what; // string matches

    bool result = boost::regex_search( str, what, exp ); // also see boost::regex_match()

    // remember the groupings (if any)
    for ( size_t idx = 0; result && idx < what.size(); idx ++ )
    {
        groups.push_back( std::string( what[idx].first, what[idx].second ) );
    }

    return result;
}

Several things worth pointing out:

Using my_regex_find()

Using this example function is quite simple.

if ( my_regex_find( "abc123xyz", "[0-9]" ) ) ...    // this returns "true" (it matches the number "1")
if ( my_regex_find( "abc123xyz", "[a-z]+" ) ) ...   // this returns "true" (it matches "abc")
if ( my_regex_find( "abc123xyz", "^[a-z]+$" ) ) ... // this returns "false" (it fails to match the entire string)

VStr results;
my_regex_find( "this is a test", "\\s([a-m]+)", results );
// this returns "true" and results[1] == "i" (the "s" that follows is outside [a-m])
// results.size() will == 2
// results[0] is the entire match (" i")
// results[1] is the first (and only) group
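Since the functions are documented as throwing when the pattern is malformed, callers that accept patterns from users or config files may want to guard the call. Here is a minimal sketch of one way to do that. It uses std::regex rather than Boost so it builds without extra libraries, and the name safe_regex_find is hypothetical. With Boost, the same shape should work by catching boost::regex_error instead.

```cpp
#include <regex>
#include <string>

// Hypothetical wrapper for this sketch: instead of letting the exception
// escape, return false when the pattern cannot be compiled and describe
// the problem in error.
bool safe_regex_find( const std::string &str, const std::string &pattern, std::string &error )
{
    error.clear();

    try
    {
        const std::regex exp( pattern ); // throws std::regex_error on a bad pattern
        return std::regex_search( str, exp );
    }
    catch ( const std::regex_error &e )
    {
        error = e.what(); // the wording of this message is implementation-specific
        return false;
    }
}
```

A pattern with an unbalanced parenthesis, such as "([0-9]", is an easy way to trigger the error path. Note that a false return is ambiguous on its own ("no match" versus "bad pattern"), so the caller needs to check whether error is empty.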

CMake

Getting Boost::regex into an existing CMake file is also relatively easy. For example:

...
SET ( Boost_DEBUG 0 )
SET ( Boost_USE_STATIC_LIBS ON )
SET ( Boost_USE_MULTITHREADED ON )
SET ( Boost_USE_STATIC_RUNTIME OFF )
FIND_PACKAGE ( Boost REQUIRED COMPONENTS regex system )
FIND_PACKAGE ( Threads REQUIRED )
INCLUDE_DIRECTORIES ( AFTER ${Boost_INCLUDE_DIR} )
...
ADD_EXECUTABLE ( test test.cpp )
TARGET_LINK_LIBRARIES ( test ${CMAKE_THREAD_LIBS_INIT} ${Boost_LIBRARIES} )
Last modified: 2014-12-13
Stéphane Charette, stephanecharette@gmail.com