So let’s start. This is my first post, so I'm feeling a bit excited. There are some questions which I couldn't solve, so kindly suggest a way out for those questions.
Conditional Formats and Labels
*Question1
Run the program here to create a temporary SAS data set called
Voter:
data voter;
input Age Party : $1. (Ques1-Ques4)($1. + 1);
datalines;
23 D 1 1 2 2
45 R 5 5 4 1
67 D 2 4 3 3
39 R 4 4 4 4
19 D 2 1 2 1
75 D 3 3 2 3
57 R 4 3 4 4
;
Add formats for Age (0–30, 31–50, 51–70, 71+), Party (D = Democrat, R = Republican),
and Ques1–Ques4 (1=Strongly Disagree, 2=Disagree, 3=No Opinion, 4=Agree,
5=Strongly Agree). In addition, label Ques1–Ques4 as follows:
Ques1 The president is doing a good job
Ques2 Congress is doing a good job
Ques3 Taxes are too high
Ques4 Government should cut spending
;
Ans
proc format;
value umar 0-30 = '0 to 30'
30<-50 = '31 to 50'
50<-70 = '51 to 70'
70<-high = '71 and over';
value $politics 'D' = 'Democrat'
'R' = 'Republican';
value $likert '1' = 'Strongly Disagree'
'2'='Disagree'
'3'='No Opinion'
'4'='Agree'
'5'='Strongly Agree';
run;
data voter;
infile datalines;
input Age Party$ (Ques1-Ques4)($1.+1);
datalines;
23 D 1 1 2 2
45 R 5 5 4 1
67 D 2 4 3 3
39 R 4 4 4 4
19 D 2 1 2 1
75 D 3 3 2 3
57 R 4 3 4 4
;
run;
PROC PRINT noobs;
RUN;
proc freq;
format Age umar.
Party $politics.
Ques1-Ques4 $likert.;
label Ques1 = 'The president is doing a good job'
Ques2 = 'Congress is doing a good job'
Ques3 = 'Taxes are too high'
Ques4 = 'Government should cut spending';
run;
*Question2
You want to see frequencies for Questions 1 to 4 from the previous question.
However, you want only three categories: Generally Disagree (combine Strongly
Disagree and Disagree), No Opinion, and Generally Agree (combine Agree and
Strongly Agree). Accomplish this using a new format for Ques1–Ques4.;
Ans
proc format;
value umar 0-30 = '0 to 30'
30<-50 = '31 to 50'
50<-70 = '51 to 70'
70<-high = '71 and over';
value $politics 'D' = 'Democrat'
'R' = 'Republican';
value $likert '1','2'='Generally Disagree'
'3'='No Opinion'
'4','5'='Generally Agree';
run;
data voter;
infile datalines;
input Age Party$ (Ques1-Ques4)($1.+1);
datalines;
23 D 1 1 2 2
45 R 5 5 4 1
67 D 2 4 3 3
39 R 4 4 4 4
19 D 2 1 2 1
75 D 3 3 2 3
57 R 4 3 4 4
;
run;
PROC PRINT noobs;
RUN;
proc freq;
format Age umar.
Party $politics.
Ques1-Ques4 $likert.;
label Ques1 = 'The president is doing a good job'
Ques2 = 'Congress is doing a good job'
Ques3 = 'Taxes are too high'
Ques4 = 'Government should cut spending';
run;
*Question 3
Run the following program to create a SAS data set called Colors (see Chapter 21
for a discussion of the double at signs [@@] in the INPUT statement):
data colors;
input Color : $1. @@;
datalines;
R R B G Y Y . . B G R B
G Y P O O V V B
;
Ans
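proc format;
/* List input reads a period as a missing character value, which is stored as a blank */
value $col 'R','B','G' = 'Group 1'
'Y','O' = 'Group 2'
' ' = 'Not Given'
other = 'Group 3';
run;
data colors;
input Color : $1. @@;
datalines;
R R B G Y Y . . B G R B G Y P O O V V B
;
run;
proc freq;
/* MISSING keeps the missing values in the table so 'Not Given' is counted */
tables Color / missing;
format Color $col.;
run;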
Performing Conditional Processing
*Question1
Run the program here to create a temporary SAS data set called School:
data school;
input Age Quiz : $1. Midterm Final;
/* Add your statements here */
datalines;
12 A 92 95
12 B 88 88
13 C 78 75
13 A 92 93
12 F 55 62
13 B 88 82
;
Using IF and ELSE IF statements, compute two new variables as follows: Grade
(numeric), with a value of 6 if Age is 12 and a value of 8 if Age is 13.
The quiz grades have numerical equivalents as follows: A = 95, B = 85, C = 75,
D = 70, and F = 65. Using this information, compute a course grade (Course) as a
weighted average of the Quiz (20%), Midterm (30%) and Final (50%).;
Ans
data school;
input Age Quiz : $1. Midterm Final;
if Age eq 12 then Grade = 6;
else if Age eq 13 then Grade = 8;
/* Convert the letter quiz grade to its numeric equivalent */
if Quiz eq 'A' then QuizGrade = 95;
else if Quiz eq 'B' then QuizGrade = 85;
else if Quiz eq 'C' then QuizGrade = 75;
else if Quiz eq 'D' then QuizGrade = 70;
else if Quiz eq 'F' then QuizGrade = 65;
/* Weighted average: Quiz 20%, Midterm 30%, Final 50% */
Course = 0.2*QuizGrade + 0.3*Midterm + 0.5*Final;
datalines;
12 A 92 95
12 B 88 88
13 C 78 75
13 A 92 93
12 F 55 62
13 B 88 82
;
run;
proc print noobs;
run;
*Question3
Using the Sales data set, list the observations for employee numbers (EmpID) 9888
and 0177. Do this two ways, one using OR operators and the other using the IN
operator.;
Ans
proc import datafile=
'/folders/myfolders/Practice/Sales.xls' dbms=xls out=Sales Replace;
run;
proc print;
run;
proc print data=Sales;
where EmpID eq '9888' or EmpID eq '0177';
run;
proc print data=Sales;
where EmpID in ('9888' '0177');
run;
*Question4
Using the Sales data set, create a new, temporary SAS data set containing Region
and TotalSales plus a new variable called Weight with values of 1.5 for the North
Region, 1.7 for the South Region, and 2.0 for the West and East Regions. Use a
SELECT statement to do this.;
Ans
data NewFile;
set Sales;
if Region eq 'North' then Weight = 1.5;
else if Region eq 'South' then Weight = 1.7;
else Weight = 2.0;
run;
proc print noobs;
var Region TotalSales Weight;
run;
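The question asks for a SELECT statement, while my answer uses IF / ELSE IF. Here is a rough sketch of how the same logic might look with SELECT. The few Sales rows below are made up just so the step runs on its own (they are not the real Sales data), and leaving Weight missing for an unexpected region is my own assumption.

```sas
/* Hypothetical stand-in for Sales: invented rows, same variable names */
data Sales;
input Region : $5. TotalSales;
datalines;
North 1500
South 2300
West 900
East 1200
;
run;

data NewFile;
set Sales(keep=Region TotalSales);
select (Region);
when ('North') Weight = 1.5;
when ('South') Weight = 1.7;
when ('West', 'East') Weight = 2.0;
otherwise; /* assumption: leave Weight missing for any other region */
end;
run;

proc print data=NewFile noobs;
run;
```

Unlike the final ELSE in the IF version, the WHEN ('West', 'East') branch does not silently give 2.0 to a misspelled or missing region.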
*Question5
Starting with the Blood data set, create a new, temporary SAS data set containing
all the variables in Blood plus a new variable called CholGroup. Define this new
variable as follows:
CholGroup   Chol
Low         Low – 110
Medium      111 – 140
High        141 – High
Use a SELECT statement to do this.;
Ans.
data BloodData;
infile '/folders/myfolders/Practice/blood.txt';
input ID$ Gender$ Bloodgroup$ Age$ WBC RBC Chol;
/* Set the length first so 'Medium' is not truncated to the length of 'Low' */
length CholGroup $ 6;
if missing(Chol) then CholGroup = ' ';
else if Chol le 110 then CholGroup = 'Low';
else if Chol le 140 then CholGroup = 'Medium';
else CholGroup = 'High';
run;
proc print noobs;
run;
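This question also asks for a SELECT statement. Below is a small sketch of the SELECT form without a select-expression, where each WHEN holds its own condition. The four Blood rows are invented so the example is self-contained (only ID and Chol are included), and the output data set name BloodChol is just a name I picked.

```sas
/* Hypothetical stand-in for Blood: invented rows, only ID and Chol */
data Blood;
input ID : $3. Chol;
datalines;
1 105
2 125
3 160
4 .
;
run;

data BloodChol;
set Blood;
length CholGroup $ 6; /* long enough for 'Medium' */
select;
when (missing(Chol)) CholGroup = ' ';
when (Chol le 110) CholGroup = 'Low';
when (Chol le 140) CholGroup = 'Medium';
otherwise CholGroup = 'High';
end;
run;

proc print data=BloodChol noobs;
run;
```

The WHEN conditions are checked in order, so the missing check has to come first. Otherwise a missing Chol would land in 'Low', because SAS treats a missing numeric value as smaller than any number.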
*Question6
Using the Sales data set, list all the observations where Region is North and
Quantity is less than 60. Include in this list any observations where the
customer name (Customer) is Pet's are Us.;
Ans
proc import datafile=
'/folders/myfolders/Practice/Sales.xls' dbms=xls out=Sales Replace;
run;
proc print data=Sales noobs;
where Customer eq "Pet's are Us" or (Region eq 'North' and Quantity lt 60);
run;
Performing Iterative Processing: Looping
*Question10
You are testing three speed-reading methods (A, B, and C) by randomly assigning
10 subjects to each of the three methods. You are given the results as three
lines of reading speeds, each line representing the results from each of the
three methods, respectively. Here are the results:
250 255 256 300 244 268 301 322 256 333
267 275 256 320 250 340 345 290 280 300
350 350 340 290 377 401 380 310 299 399
Create a temporary SAS data set from these three lines of data. Each observation
should contain Method (A, B, or C), and Score. There should be 30 observations in
this data set. Use a DO loop to create the Method variable and remember to use a
single trailing @ in your INPUT statement. Provide a listing of this data set
using PROC PRINT.;
Ans
data speed_test;
do Method = 'A','B','C';
do subj = 1 to 10;
/* A single trailing @ holds the line, since all 30 reads happen in one iteration of the DATA step */
input Score @;
output;
end;
end;
datalines;
250 255 256 300 244 268 301 322 256 333
267 275 256 320 250 340 345 290 280 300
350 350 340 290 377 401 380 310 299 399
;
run;
proc print noobs;
run;
Working with Dates
*Data set HOSP;
data hosp;
do j = 1 to 1000;
AdmitDate = int(ranuni(1234)*1200 + 15500);
quarter = intck('qtr','01jan2002'd,AdmitDate);
do i = 1 to quarter;
if ranuni(0) lt .1 and weekday(AdmitDate) eq 1 then
AdmitDate = AdmitDate + 1;
if ranuni(0) lt .1 and weekday(AdmitDate) eq 7 then
AdmitDate = AdmitDate - int(3*ranuni(0) + 1);
DOB = int(25000*Ranuni(0) + '01jan1920'd);
DischrDate = AdmitDate + abs(10*rannor(0) + 1);
Subject + 1;
output;
end;
end;
drop i j;
format AdmitDate DOB DischrDate mmddyy10.;
run;
proc print data=hosp (obs=10);
run;
*Question4
Using the Hosp data set, compute the subject’s ages two
ways: as of January 1, 2006
(Call it AgeJan1), and as of today’s date (call it
AgeToday). The variable DOB
represents the date of birth. Take the integer portion of
both ages. List the first 10
observations.;
Ans
data Ages;
set hosp;
/* INT keeps only the integer portion of each age */
AgeJan1 = int(yrdif(DOB,'01Jan2006'd,'Actual'));
AgeToday = int(yrdif(DOB,today(),'Actual'));
run;
title "Listing of AGES";
proc print data=Ages (obs=10);
run;














