How to Import in SAS
Share
We can import any file to SAS by using PROC IMPORT procedure. It is used to read external files in SAS to read and store in SAS.
The syntax of PROC IMPORT is :
PROC IMPORT
DATAFILE=”filename”
OUT=SAS-data-set
DBMS=identifier
REPLACE;
SHEET=”Sheet-name”;
GETNAMES=YES;
DATAROW=N;
RANGE=”range-name”;
Namerow=N;
startrow=N;
RUN;
1. DATAFILE = option tells SAS where to find the external file that you want to import . It shows the file path from where we should import it.
For example, DATAFILE=” D:\sas\abc.xls “
2. OUT= option tells SAS to create a dataset with any name of your choice . By default , the dataset is save on WORK library . We can assign permanent library name to save the dataset in library.
For example , OUT = abc
In this statement , the dataset stored as “abc” in WORK library.
OUT=data.abc
In this statement , the dataset stored permanently in data library.
3. DBMS = option tells SAS the type of file to read .
For example , DBMS= XLS for file have format “.xls ” .
DBMS= CSV for Comma Separated Value data .
4. REPLACE is used to overwrite the existing SAS dataset (if stored ) in the OUT= option
5. SHEET = option is used to specify which sheet SAS would import in excel file.
For example ,
SHEET = “Sheet1” – To import data from worksheet named sheet1 .
SHEET= “Data” – To import data from worksheet named Data.
6. GETNAMES = option tells SAS to use the first row as variable names.
By default , PROC IMPORT uses GETNAMES = YES . We can change GETNAMES = NO , SAS would not read variable names from first row of the sheet.
7. DATAROW = option is used to specify starting row from where SAS would import the data.
For example , DATAROW = 5 tells SAS to start reading data from row number 5.
8. RANGE= option is used to specify which range SAS would import .
For example,
RANGE= “Sheet1$B3:D25” – Tell SAS to import data from range B3 : D25 from Sheeet1.
RANGE = “Database” – Tell SAS to import data from excel defined name range . The name range used is Database .
First , we upload the file we want to import in SAS . We click on the upload button to upload file in satya folder .
We upload bankdata.txt file in satya folder as :
We want to import text file in SAS . We have a file “bankdata.txt” to import in SAS . The data is look like :
We want to find the location of file as :
First , we right click on bankdata.txt file to view Properties as :
We click on Properties to view location of file as :
The data is stored as tab delimited data. We used following statements to import our dataset as :
datafile = “/home/satyamodi0/satya/bankdata.txt” tells SAS about the path of bankdata.txt to import.
out= work.abc tells SAS to create a dataset named abc stored in work library .
dbms = tab tells SAS the tab delimited file to read.
Replace is used to overwrite the abc dataset if it exists already .
getnames = no tells SAS would not read first row as variable names .
We used following code to import dataset in SAS :
proc import datafile=”/home/satyamodi0/satya/bankdata.txt” out=work.abc dbms=tab replace;
getnames=no;
run;
OUTPUT :
As , we do not have variables name in the data. So , SAS assign VAR1 , VAR2 and VAR3 as variable names by default.
We want to import space delimited data in SAS . We used extra statement as :
dbms = dlm tells SAS to the delimited file to read.
delimiter =” ” tells SAS to read space delimited data .
The bankdata.txt dataset as :
We used following code to import dataset as :
proc import datafile=”/home/satyamodi0/satya/bankdata.txt” out=work.abc dbms=dlm replace;
delimiter=” “;
getnames=no;
run;
OUTPUT:
We want to import CSV(Comma Separated Value ) data. The data is separated by comma(,) . The dataset is look like :
The dataset have a missing in third row and third column . The output shows period(.) to represent missing in the data.
We used dbms = csv to tell SAS to read CSV file format .
We can used following code to import mydata.csv file :
proc import datafile=”/home/satyamodi0/satya/mydata.csv” out=data dbms=csv replace;
getnames=no;
run;
OUTPUT:
We have a delimited data as :
We want to import this dataset by using delimiter = “$” to read it . We have extra $ in second line . The new column create which represents blank in the output as the $ represent missing in data.
proc import datafile=”/home/satyamodi0/satya/company.txt” out=data dbms=dlm replace;
delimiter=”$”;
getnames=no;
run;
OUTPUT:
We used same data but remove extra $ from second line . The output resembles with the dataset .
We used following code to import company.txt data as:
proc import datafile=”/home/satyamodi0/satya/company.txt” out=data dbms=dlm replace;
delimiter=”$”;
getnames=no;
run;
OUTPUT:
We have five columns ID , Collision , HardBrake , Biz_Type and Z_Drug in our dataset :
We import dataset as :
proc import datafile=”/home/satyamodi0/satya/Trucker_hw3_add.csv” out=trucker dbms=csv replace;
run;
OUTPUT:
The dataset is look like as :
We used following statements to import dataset as :
datafile=”/home/satyamodi0/satya/drugtest.xls” tells SAS where to find Excel file that you want to import.
out=my tells SAS to store out in work library .
dbms=xls tells SAS the XLS file format to read.
getnames= yes tells SAS to read name of varibales as the first row of data.
We want to import excel data drugtest.xls by using following code :
proc import datafile=”/home/satyamodi0/satya/drugtest.xls” out=my dbms=xls replace;
getnames=yes;
run;
OUTPUT:
We have the following dataset to import :
We used following code to import commute.xls file as:
proc import datafile=”/home/satyamodi0/satya/commute.xls” out=my dbms=xls replace;
run;
OUTPUT:
We used DATAROW = 4 tells SAS to start reading data from row number 4 . In this case , variable names would pull from first row but data extracted from row 4 .
proc import datafile=”/home/satyamodi0/satya/commute.xls” out=my dbms=xls replace;
datarow=4;
run;
OUTPUT:
We used endrow =45 to stop reading when row number comes 45 . So, SAS reads data up to row number 45 . We used following code :
proc import datafile=”/home/satyamodi0/satya/commute.xls” out=my dbms=xls replace;
endrow=45;
run;
OUTPUT:
We used following statement :
datarow = 10 tells SAS to start reading data from row number 10
endrow = 150 tells SAS to stop reading when row number 150 comes .
We used following code to import commute.xls data :
proc import datafile=”/home/satyamodi0/satya/commute.xls” out=my dbms=xls replace;
datarow=10;
endrow=150;
run;
We can see from the dataset , SAS reads from row number 10 as :
OUTPUT:
We can also read data from different sheets in an excel file . We have data stored in data worksheet of commute.xls file as :
We used following statements as :
sheet = “data” tells SAS to import data from data .
The following code is used to import data from data worksheet in commute.xls data :
proc import datafile=”/home/satyamodi0/satya/commute.xls” out=my dbms=xls replace;
sheet=”data”;
run;
OUTPUT:
We have used following statements as :
range=”data$A1:C125″ tells SAS to import data from range A1:C125 from data worksheet.
The following code used to read ranged data as:
proc import datafile=”/home/satyamodi0/satya/commute.xls” out=my dbms=xls replace;
getnames=yes;
range=”data$A1:C125″;
run;
OUTPUT:
We used Range=” data$B1:C1250″ to import data from B1 to C1250 . It stores 1249 rows and 2 columns. The first column is name of variables .
proc import datafile=”/home/satyamodi0/satya/commute.xls” out=my dbms=xls replace;
getnames=yes;
range=”data$B1:C1250″;
run;
OUTPUT:
The following file shows , we have named range “Abc” to import in SAS .
We used following code to import named range in SAS :
range = “Abc” tells SAS to read data from named range .
proc import datafile=”/home/satyamodi0/satya/commute.xls” out=my dbms=xls replace;
range=”Abc”;
run;
OUTPUT:
We can also import Excel files of 2013 and above . We have following data as :
We used following statements as :
dbms = xlsx tells SAS to read the XLSX file format .
The following code is used to import my.xlsx file :
proc import datafile=”/home/satyamodi0/satya/my.xlsx” out=my dbms=xlsx replace;
run;
OUTPUT:
We used following statements :
Namerow = 1 tells SAS to read variables from first row of data.
Startrow = 19 tells SAS to read data from row number 19.
proc import datafile=”/home/satyamodi0/satya/commute.xls” out=my dbms=xls replace;
Namerow=1;
Startrow=19;
run;
OUTPUT:
We used following statements :
startcol=2 tells SAS to import data from column number 2 .
proc import datafile=”/home/satyamodi0/satya/commute.xls” out=my dbms=xls replace;
getnames=no;
Namerow=1;
startcol=2;
Range=”Sheet1$B12:D185″;
run;
OUTPUT:
We used Keep = Time Depart to retain Time and Depart variable in my dataset .
proc import datafile=”/home/satyamodi0/satya/commute.xls” out=my(keep= Time Depart) dbms=xls replace;
getnames=yes;
run;
OUTPUT:
We used drop= to drop Time and Depart from my dataset.
proc import datafile=”/home/satyamodi0/satya/commute.xls” out=my(drop= Time Depart) dbms=xls replace;
getnames=yes;
run;
OUTPUT:
We can also rename variables by using following statement :
out=my(rename=(Reds=temp)) tells SAS to rename Reds to temp and store in my dataset .
proc import datafile=”/home/satyamodi0/satya/commute.xls” out=my(rename=(Reds= temp)) dbms=xls replace;
getnames=yes;
run;
OUTPUT:
We can also apply condition while storing in SAS. We used following statement :
out= my(where=(time>42)) tells SAS to store data having time greater than 42 value .
proc import datafile=”/home/satyamodi0/satya/commute.xls” out=my(where=(time > 42)) dbms=xls replace;
getnames=yes;
run;
OUTPUT:
We stored data that contains time greater than 42 and depart less than 25 :
proc import datafile=”/home/satyamodi0/satya/commute.xls” out=my(where=(time > 42 and depart <25)) dbms=xls replace;
getnames=yes;
run;
OUTPUT: