How to delete observations containing characters
The SAS Google Group had a question on how to delete observations containing characters (e.g. 92z89 or abcd) and only keep the ones with numeric values.
The thread was hijacked by another topic, but a number of good options were submitted. Here is a summary of the suggestions:
data mydata;
input value $;
cards;
1023442
92z89
abcd
5231295
09CX42
9e122
12E3
98722
;
run;
*Suggestion #1 - simple deleting;
*Keep only numeric values;
data want1; set mydata;
if anyalpha(value)>0 then delete;
run;
*Suggestion #2 - using PRXMATCH function;
data want2; set mydata;
if prxmatch("/[a-zA-Z]/",value)=0;
/* PRXMATCH searches for a pattern match and returns the position at which the pattern is found, or 0 if there is no match, so this subsetting IF keeps only values without letters */
run;
*Suggestion #3;
/*
Do you want to delete only the letters?
Also, it's not a good idea to simply delete data without looking at the values. You may find you want to keep
those observations. So why not use the enhanced COMPRESS function (SAS 9 or above) to create a new variable instead?
*/
data want3a; set mydata;
var = Compress (value, ,"Kd") ;
*Kd is keep digits;
run;
*SAS 8 equivalent - list the letters to remove explicitly;
data want3b; set mydata;
var = Compress (value, "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz");
run;
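*A possible variation on Suggestion #3 (a sketch, not from the thread) - flag the rows instead of deleting them;
*The names want3c, has_alpha and digits_only are illustrative assumptions;
data want3c; set mydata;
has_alpha = (anyalpha(value)>0);
*has_alpha is 1 if the value contains a letter and 0 otherwise;
digits_only = compress(value, ,"kd");
*For example 92z89 becomes 9289 and abcd becomes blank;
run;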
*Suggestion #4;
/* What about 12E3? It contains an alphabetic character but is nevertheless a valid
external representation of a numeric value.
If you need to keep numbers written in
scientific notation, try the following method: */
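%LET N_Errors = %SYSFUNC(GETOPTION(Errors));
OPTIONS ERRORS=0;
data want4; set mydata;
if missing(input(value,best12.)) then delete;
run;
/* Restore the original ERRORS setting only after the step has run. OPTIONS is a global statement, so placing it before RUN would reset the option before the data step executes */
OPTIONS ERRORS=&N_Errors;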
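*A possible alternative for Suggestion #4 (a sketch, not from the thread) - use the ?? modifier in the INPUT function;
*The ?? modifier suppresses the invalid data messages and keeps _ERROR_ at 0, so the ERRORS option does not have to be changed;
*The data set name want4b is an illustrative assumption;
data want4b; set mydata;
if missing(input(value, ?? best12.)) then delete;
run;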

