Perl FAQ 4.27

How do I use a regular expression to strip C style comments from a file?

Since we're talking about how to strip comments under perl5, now is a good time to talk about doing it in perl4. Since comments can be embedded in strings, or look like function prototypes, care must be taken to ignore these cases. Jeffrey Friedl* proposes the following two programs to strip C comments and C++ comments respectively:

C comments:

#!/usr/bin/perl
$/ = undef;    # slurp mode: read the whole input as one string
$_ = <>;

s#/\*[^*]*\*+([^/*][^*]*\*+)*/|([^/"']*("[^"\\]*(\\[\d\D][^"\\]*)*"[^/"']*|'[^'\\]*(\\[\d\D][^'\\]*)*'[^/"']*|/+[^*/][^/"']*)*)#$2#g;
print; 
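
As a rough illustration of why the string handling matters, the sketch below wraps the substitution above in a subroutine and compares it with a naive pattern that ignores string literals. The names strip_c_comments and strip_naive and the sample input are assumptions made for this example, not part of Jeffrey's program. The /e modifier and the defined() test are added only so the sketch stays quiet under warnings.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical wrapper around the substitution shown above.
# The replacement uses /e with defined() so that an unset $2
# (the comment branch matched) yields an empty string without a warning.
sub strip_c_comments {
    my ($src) = @_;
    $src =~ s#/\*[^*]*\*+([^/*][^*]*\*+)*/|([^/"']*("[^"\\]*(\\[\d\D][^"\\]*)*"[^/"']*|'[^'\\]*(\\[\d\D][^'\\]*)*'[^/"']*|/+[^*/][^/"']*)*)#defined($2) ? $2 : ''#ge;
    return $src;
}

# Naive comparison: delete everything between /* and */, strings included.
sub strip_naive {
    my ($src) = @_;
    $src =~ s{/\*.*?\*/}{}gs;
    return $src;
}

# Sample input (an assumption for this sketch).
my $code = qq{int x = 1; /* set x */\nchar *s = "/* not a comment */";\n};

print strip_c_comments($code);   # the string literal survives intact
print strip_naive($code);        # the string literal is emptied
```

The naive version treats the text inside the string literal as a comment, which is exactly the case the longer pattern is written to avoid.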

C++ comments (this program does not delete them; it rewrites each // comment as a /* ... */ comment):

#!/usr/local/bin/perl
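$/ = undef;    # slurp mode: read the whole input as one string
$_ = <>;
# defined() rather than a truth test, so that an empty // comment or "//0" is converted too
s#//(.*)|/\*[^*]*\*+([^/*][^*]*\*+)*/|"(\\.|[^"\\])*"|'(\\.|[^'\\])*'|[^/"']+#  defined($1) ? "/*$1 */" : $& #ge;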
print;

(Yes, Jeffrey says, those are complete programs to strip comments correctly.)
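
The second program does its work in the replacement expression: the first alternative captures the text of a // comment in $1, and every other alternative (C comments, string and character literals, plain code) is copied through unchanged via $&. A minimal sketch of that behaviour follows. The subroutine name cxx_to_c_comments and the sample input are assumptions for illustration, and defined() is used here so that an empty // comment is also converted.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical wrapper: rewrite each // comment as a /* ... */ comment,
# leaving string literals, character literals and C comments untouched.
sub cxx_to_c_comments {
    my ($src) = @_;
    $src =~ s#//(.*)|/\*[^*]*\*+([^/*][^*]*\*+)*/|"(\\.|[^"\\])*"|'(\\.|[^'\\])*'|[^/"']+#  defined($1) ? "/*$1 */" : $& #ge;
    return $src;
}

# Sample input (an assumption for this sketch): a division, a real
# // comment, and a string literal that merely contains "//".
my $code = qq{x = a / b; // ratio\nputs("// not a comment");\n};

print cxx_to_c_comments($code);
```

Only the real comment is rewritten; the division operator and the string literal pass through as they are. The result could then be fed to the C comment program if the goal is to remove the comments entirely.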

