2 Dec 2022


What is Raw Data and How Can I Use It?

Format: APA

Academic level: College

Paper type: Term Paper

Words: 1497

Pages: 5


Preliminary 

Raw data is data organized into columns and rows. Columns represent fields named after the variable observed. There are five variables, hence five columns, and observations are recorded in rows, with a total of 200 entries for each variable. There are no empty cells, and all observations are entered in ordinal form. Three variables (food expenses, income, and non-mortgage debts) are recorded as continuous data. These annual amounts, which form the first three columns of my data, are quantitative variables. The last two columns represent the qualitative variables region and location. Observations for qualitative variables are categorical and recorded as discrete data.

Descriptive analysis has been conducted on two kinds of variables: qualitative and quantitative. The results summarize the data by region and for the data set as a whole. Inferential statistics are used to establish whether there is a significant relationship between income and expenses. To get a wider view, the data has first been analyzed across all 200 entries, giving results for the combined data. Subsequent analysis has been conducted by region. One limitation of my analysis is that it uses only one inferential method. To compare findings, one can conduct a t-test to check whether the results are similar for the same data and the hypothesis being tested.
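The t-test suggested above as a cross-check could look roughly like the sketch below. It computes Welch's t statistic for two samples using only the standard library. The two income lists are hypothetical values invented for illustration and are not taken from the raw data, and the function name is an assumption of this sketch.

```python
from math import sqrt
from statistics import mean, variance


def welch_t_statistic(sample_a, sample_b):
    """Return Welch's t statistic for the difference between two sample means."""
    n_a, n_b = len(sample_a), len(sample_b)
    # Standard error of the difference, using each sample's own variance.
    standard_error = sqrt(variance(sample_a) / n_a + variance(sample_b) / n_b)
    return (mean(sample_a) - mean(sample_b)) / standard_error


# Hypothetical household incomes for two regions (not from the raw data).
incomes_region_a = [52_000, 61_000, 58_500, 49_000, 63_000]
incomes_region_b = [47_000, 51_500, 55_000, 44_000, 50_500]

t_stat = welch_t_statistic(incomes_region_a, incomes_region_b)
print(f"t = {t_stat:.2f}")
```

The resulting statistic would still need to be compared with a critical value at the chosen significance level before drawing a conclusion.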


Descriptive Statistics 

The primary objective of descriptive statistics is to organize my raw data into a useful summary for consumers. Consumers of my report include, but are not limited to, producers, entrepreneurs, local and national government, NGOs, higher-education and research institutions, and community-based organizations. Descriptive statistics give a summary report of the raw consumer food data. The summary reports generated in my analysis include charts and tables. The tabulated report covers both qualitative and quantitative variables. ‘Geographic Region’ and ‘Location’ are qualitative data. The annual figures, which include Food Spending per Household, Household Income, and Non-Mortgage Household Debt, are quantitative variables. Geographic region is the principal independent variable through which the other variables are analyzed and tabulated. The key tabulation approach is frequency tables, which are also used to generate charts. Once fully analyzed, the results are summarized further in a cross-tabulated table (Lane, n.d.).

Frequency Distribution 

The data has five variables with 200 entries each. The frequency distribution of observations can be analyzed by region and location. Figure 1.0 shows the distribution of participants by region. Counts of households from metropolitan and outside-metro areas are tabulated in Table 1.0 and Figure 1.1 below.

Figure 1.0 

Table 1.0 

Figure 1.1 

Quantitative data on income, food expenditure, and non-mortgage loans are summarized using descriptive statistics in the Excel data analysis pack. Results and further analysis are shown in Tables 1.1 and 1.2.

Table 1.1 

Table 1.2 

Results 

There are 60 entries from the Northeast, 45 from the Midwest, 40 from the South, and 55 from the West.
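As a minimal sketch of how such a frequency distribution can be tabulated, the snippet below rebuilds a region column from the counts reported above and counts it back. The variable names are assumptions of this sketch, and the rebuilt column stands in for the actual raw data, which is not reproduced here.

```python
from collections import Counter

# Region counts as reported in the text (200 entries in total).
region_counts = {"Northeast": 60, "Midwest": 45, "South": 40, "West": 55}

# Rebuild a region column of 200 labels, standing in for the raw data column.
region_column = [region for region, n in region_counts.items() for _ in range(n)]

# Frequency table: absolute counts and relative frequency in percent.
frequency = Counter(region_column)
total = sum(frequency.values())
relative = {region: round(100 * n / total, 1) for region, n in frequency.items()}

for region, n in frequency.items():
    print(f"{region:<10} {n:>3}  {relative[region]:>5.1f}%")
```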

Figure 1.2 

Analysis per region was as follows:

Northeast 

Table 1.3 

Midwest 

Table 1.4 

South 

Table 1.5 

West 

Table 1.6 

Results 

Households that participated in the research were from either a metropolitan area or outside a metropolitan area. The distribution of participants between metro and outside-metro areas was as follows:

In the Northeast, 66.7% were from the metro area while 33.3% were residents outside metropolitan areas.

In the Midwest, 30 (66.7%) were from the metro area and 15 (33.3%) were from outside the metro area.

The South had 40 participants split evenly, with 20 (50%) from the metro area and 20 (50%) from outside it.

The West had 30 entries (54.5%) from the metro area and 25 entries (45.5%) from outside the metro area.
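The percentages above can be reproduced from the counts with a short sketch like the one below. The Midwest, South, and West counts are quoted in the text. The Northeast counts (40 and 20) are an assumption inferred from 66.7% and 33.3% of its 60 entries, since the text gives only percentages for that region.

```python
# (metro, outside metro) household counts per region.
# Midwest, South, and West counts are quoted in the text; the Northeast
# counts are an assumption inferred from 66.7% / 33.3% of 60 entries.
location_counts = {
    "Northeast": (40, 20),
    "Midwest": (30, 15),
    "South": (20, 20),
    "West": (30, 25),
}


def metro_share(metro, outside):
    """Return (metro %, outside-metro %) rounded to one decimal place."""
    total = metro + outside
    return round(100 * metro / total, 1), round(100 * outside / total, 1)


shares = {region: metro_share(*counts) for region, counts in location_counts.items()}

for region, (metro_pct, outside_pct) in shares.items():
    print(f"{region:<10} metro {metro_pct}%  outside metro {outside_pct}%")
```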

The mean annual food expense was $8,966.07, the mean household income was $55,552.39, and the mean non-mortgage loan amount was $15,604.16. Totals for the quantitative variables were:

Food expenses $1,793,213.00

Income $11,110,478.07

Non-mortgage debts $3,120,831.54
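The relationship between these totals and the reported means is simply total divided by the number of entries. A minimal sketch, assuming all 200 entries contribute to each total:

```python
N_ENTRIES = 200  # number of households in the data set

# Totals as reported in the text.
totals = {
    "food": 1_793_213.00,
    "income": 11_110_478.07,
    "debt": 3_120_831.54,
}

# Mean = total / number of entries.
means = {name: total / N_ENTRIES for name, total in totals.items()}

for name, value in means.items():
    print(f"mean {name}: ${value:,.2f}")
```

The values obtained this way agree with the reported means to within a cent.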

Total income per region was as follows:

Northeast $3,441,731.07

Midwest $2,450,626

South $2,020,326

West $3,197,795

Expenses on food and debts 

Region  Food expenses annually ($)  Non-mortgage debts ($)
Northeast  568,079.00  824,556.34
Midwest  389,686  576,322
South  313,358  748,678
West  522,090  971,275

Table 1.7 

Findings 

Analysis by region and location shows that more people reside within metropolitan areas than outside them. This holds in every region except the South, where the sample was split 50:50 between the two locations. We can deduce that metropolitan areas have the higher population because the sample size is assumed to follow a formula tied to population: the number of individuals participating in the survey corresponds to the population they are drawn from, so the greater the population, the larger the sample.

Comparing average income by region, the Northeast had $57,362.19, the Midwest $54,458, the South $51,069, and the West $58,142. The West had the residents with the highest income and the South the lowest. Relative to income, the regions showed the following distribution of expenses:

Region  Food expenses percentage  Non-mortgage debts percentage
Northeast  16.51%  23.96%
Midwest  15.90%  23.52%
South  15.51%  37.06%
West  16.33%  30.37%

Table 1.8 
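The shares of income spent on food and on non-mortgage debts can be recomputed from the regional totals with a sketch like the one below. The spending figures come from Table 1.7 and the Midwest, South, and West income totals from the text. The Northeast income total is an assumption of this sketch: it is taken as the overall income total minus the other three regional totals.

```python
TOTAL_INCOME = 11_110_478.07  # all 200 households, as reported

# Regional income totals as reported for three of the regions.
region_income = {"Midwest": 2_450_626, "South": 2_020_326, "West": 3_197_795}
# Assumption: Northeast income is what remains of the overall total.
region_income["Northeast"] = TOTAL_INCOME - sum(region_income.values())

# (annual food expenses, non-mortgage debts) per region, from Table 1.7.
region_spending = {
    "Northeast": (568_079.00, 824_556.34),
    "Midwest": (389_686, 576_322),
    "South": (313_358, 748_678),
    "West": (522_090, 971_275),
}


def share_of_income(amount, income):
    """Return amount as a percentage of income, rounded to two decimals."""
    return round(100 * amount / income, 2)


shares = {
    region: (
        share_of_income(food, region_income[region]),
        share_of_income(debt, region_income[region]),
    )
    for region, (food, debt) in region_spending.items()
}

for region, (food_pct, debt_pct) in shares.items():
    print(f"{region:<10} food {food_pct:.2f}%  debts {debt_pct:.2f}%")
```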

Inferential statistics 

Inferential statistics refers to statistical analysis methods that produce point estimates, which can be used to show whether the observations made are correlated. Inferential statistics with one variable under observation is univariate. When two variables are compared, the analysis is bivariate. Statistical analysis in which three or more variables are compared is multivariate (Artemiou, 2009).

Common point estimates used in inferential statistics include the p-value, the critical value, and the t-statistic. The p-value is the probability, assuming the null hypothesis is true, of obtaining a result at least as extreme as the one observed; it is compared with the significance level to decide whether to reject or fail to reject the null hypothesis. The critical value is the point that marks the boundary between the critical and non-critical regions of the test. A t-statistic that falls in the non-critical region (within the boundary set by the critical value) is consistent with the null hypothesis, which is then not rejected. To calculate these estimates, one must state the significance level. The significance level sets the limit on the probability of committing a Type I error that the researcher is willing to accept. A Type I error occurs when one rejects the null hypothesis when in reality it is true (Tanbakuchi, 2009).
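To make the link between significance level, critical value, and p-value concrete, here is a minimal sketch using only the Python standard library. It uses the standard normal distribution as a large-sample stand-in for the t distribution, which is an assumption of this sketch rather than the method used in the analysis. The significance level of 0.05 and the observed statistic of 2.5 are hypothetical values chosen for illustration.

```python
from statistics import NormalDist

ALPHA = 0.05  # significance level: accepted probability of a Type I error

std_normal = NormalDist()  # mean 0, standard deviation 1

# Two-sided critical value: boundary between non-critical and critical regions.
critical_value = std_normal.inv_cdf(1 - ALPHA / 2)


def two_sided_p_value(z):
    """Probability, under the null hypothesis, of a statistic at least as extreme as z."""
    return 2 * (1 - std_normal.cdf(abs(z)))


def reject_null(z, alpha=ALPHA):
    """Reject the null hypothesis when the p-value falls below alpha."""
    return two_sided_p_value(z) < alpha


z_observed = 2.5  # hypothetical test statistic, for illustration only
print(f"critical value: ±{critical_value:.3f}")
print(f"p-value: {two_sided_p_value(z_observed):.4f}")
print(f"reject null hypothesis: {reject_null(z_observed)}")
```

Comparing the statistic with the critical value and comparing the p-value with the significance level lead to the same decision.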

Research problem 

The data provided shows household annual income, food expenses, and the loan amount being serviced. It also shows the respondent’s region and where in that region they reside. Based on this data, inferential statistics are used to establish the relationships between income and expenses. My analysis first covers the data as a whole and then analyzes it by region.

Research question 1: Is there a correlation between income and expenses?

Hypothesis 

H0: µ1 = µ2 & µ3

H1: µ1 ≠ µ2 & µ3

Data – raw data provided 

Test – correlation 

Decision rule 

Value  Meaning
-1.0 to -0.5  Strong negative correlation
-0.4 to -0.1  Weak negative correlation
0  No correlation
0.1 to 0.4  Weak positive correlation
0.5 to 0.9  Strong positive correlation
1  Perfect correlation

Table 2.0 
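The decision rule in Table 2.0 can be expressed as a small function. This is a sketch under one stated assumption: because the bands are given to one decimal place and leave gaps between them, the correlation factor is rounded to one decimal before it is classified. The function name is hypothetical.

```python
def interpret_correlation(r):
    """Map a correlation factor to the labels used in the decision rule."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("a correlation factor must lie between -1 and 1")
    rounded = round(r, 1)  # assumption: bands are read at one decimal place
    magnitude = abs(rounded)
    if magnitude == 0:
        return "No correlation"
    if rounded == 1:
        return "Perfect correlation"
    strength = "Strong" if magnitude >= 0.5 else "Weak"
    direction = "positive" if rounded > 0 else "negative"
    return f"{strength} {direction} correlation"


# Illustrative inputs only.
for r in (0.85, -0.13, -0.01, 1.0):
    print(f"{r:>6}: {interpret_correlation(r)}")
```

Rounding first is a design choice: a value such as -0.01 is then read as zero, in the same way that scores are approximated to one decimal place elsewhere in the analysis.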

Results 

Results for the relationship between income and expenses across the 200 entries, analyzed through the correlation factor, are shown in Figure 2.1.

Figure 2.1 

Findings 

Food expenses and income have a strong correlation.

Income and debts have a weak negative correlation.

Food expenses and debts have a weak positive correlation.

Relations showing a perfect correlation result from a variable being compared with itself.
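For reference, the correlation factor behind findings like these can be computed as a Pearson coefficient. The sketch below is a plain standard-library version and is not necessarily the exact routine used to produce the figures. The income and food values are hypothetical and only illustrate the calculation, including why a variable compared with itself gives a perfect correlation.

```python
from math import sqrt


def pearson(x, y):
    """Return the Pearson correlation coefficient of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length with at least 2 values")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products and sums of squares of deviations from the means.
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    squares_x = sum((a - mean_x) ** 2 for a in x)
    squares_y = sum((b - mean_y) ** 2 for b in y)
    if squares_x == 0 or squares_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cross / sqrt(squares_x * squares_y)


# Hypothetical values for five households (not from the raw data).
income = [30_000, 45_000, 60_000, 75_000, 90_000]
food = [5_200, 7_100, 9_500, 11_800, 14_500]

print(f"income vs food:   {pearson(income, food):.2f}")
print(f"income vs income: {pearson(income, income):.2f}")
```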

Research question 2: Is there a correlation between incomes in different regions?

H0, 2: µ1 = µ2 & µ3

H1, 2: µ1 ≠ µ2 & µ3

Data – raw data provided 

Test – correlation 

Value  Meaning
-1.0 to -0.5  Strong negative correlation
-0.4 to -0.1  Weak negative correlation
0  No correlation
0.1 to 0.4  Weak positive correlation
0.5 to 0.9  Strong positive correlation
1  Perfect correlation

Table 2.2 

Results

  income NE  income MW  income SOUTH  income WEST 
income NE 1.0      
income MW 0.1 1.0    
income SOUTH 0.1 0.0 1.0  
income WEST 0.1 0.0 -0.2 1.0

Table 2.3 

Findings 

Analysis of the correlation between incomes in the different regions gives correlation factors that can be approximated to zero (0). This shows there is no correlation between the incomes of populations in different regions.

Research question 3: Is there a correlation between income and non-mortgage debts in each region?

H0, NE: µ1 = µ2

H1, NE: µ1 ≠ µ2

Results

  food expenses  income  debt 
food expenses 1    
income 0.94128856 1  
debt 0.238658287 0.283998809 1

Table 2.4 

Findings 

Food expenses and income in the Northeast have a strong positive correlation of 0.9, while debt and income have a weak positive correlation of about 0.28.

H0, MW: µ1 = µ2

H1, MW: µ1 ≠ µ2

Results

  food expenses  income  debt 
food expenses 1.00    
income 0.88 1.00  
debt -0.16 -0.13 1

Table 2.5 

Findings 

There is a strong positive correlation between food expenses and income in the Midwest, but debt has a weak negative correlation with income of -0.1.

  food expenses  income  debt 
food expenses 1.00    
income 0.88 1.00  
debt -0.16 -0.13 1

Table 2.6 

H0, S: µ1 = µ2

H1, S: µ1 ≠ µ2

Results 

  food expenses  income  debt 
food expenses 1.00    
income 0.85 1.00  
debt 0.06 -0.01 1.00

Table 2.7 

Findings 

The correlation factor of 0.85 shows a strong positive correlation between food expenses and income in the South.

The debt scores can be approximated to zero (to one decimal place), showing there is no correlation between income and debts in the South.

H0, W: µ1 = µ2

H1, W: µ1 ≠ µ2

Results  

  food expenses  income  debt 
food expenses 1.00    
income 0.71 1.00  
debt -0.11 -0.23 1.00

Table 2.8 

Findings 

The correlation between income and debt in the West is weak, with a score of -0.2. The score for food expenses (0.7) indicates a strong relationship between how much people spend and how much they earn.
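The per-region analysis for research question 3 amounts to splitting the rows by region and computing a correlation within each group. A minimal sketch of that procedure follows. The rows are hypothetical values for two regions, invented for illustration, and the helper and variable names are assumptions of this sketch.

```python
from collections import defaultdict
from math import sqrt


def pearson(x, y):
    """Return the Pearson correlation coefficient of two equal-length sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    squares_x = sum((a - mean_x) ** 2 for a in x)
    squares_y = sum((b - mean_y) ** 2 for b in y)
    return cross / sqrt(squares_x * squares_y)


# Hypothetical (region, income, non-mortgage debt) rows, not from the raw data.
rows = [
    ("Northeast", 40_000, 9_000),
    ("Northeast", 55_000, 12_000),
    ("Northeast", 70_000, 11_000),
    ("Northeast", 85_000, 16_000),
    ("South", 35_000, 18_000),
    ("South", 50_000, 15_000),
    ("South", 65_000, 17_000),
    ("South", 80_000, 12_000),
]

# Group the income and debt columns by region.
by_region = defaultdict(lambda: ([], []))
for region, income, debt in rows:
    incomes, debts = by_region[region]
    incomes.append(income)
    debts.append(debt)

# One income-debt correlation factor per region.
income_debt_r = {
    region: pearson(incomes, debts) for region, (incomes, debts) in by_region.items()
}

for region, r in income_debt_r.items():
    print(f"{region:<10} income vs debt: {r:.2f}")
```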

Conclusion 

Based on the descriptive analysis by region, we can conclude the following business-related relations. In all regions, business activity is higher in metropolitan areas than outside them because of the higher population. Secondly, the West and Midwest are the better regions to invest in. The West has a population with higher income and hence more to spend. The Midwest has a lower average income than the West, but its spending on non-mortgage debts is lower: 23.52% of income compared to 30.37%. Lower spending on loans in the Midwest means its population also has more to spend.

Because of low income in the South, household heads are forced to borrow, hence the higher share of non-mortgage debt being serviced. Residents of the South spend more than 50% of their income on food and debts, which reduces the cash at their disposal, so the region is not a priority for investment. Average food expenses in all regions can be approximated to 16% of income. This relation is common to all regions because food is a common good. Food expenses also have a linear correlation with disposable income: as income increases, expenses on food also increase. The negative correlation between income and expenses indicates overspending, while a weak positive correlation shows that income has little influence on how respondents pay their loans.

References

Artemiou, A. (2009). Measures of Variability [PDF]. Retrieved from https://pages.mtu.edu/~aartemio/Courses/Stat318/Lectures/Chapter1/Chapter1_Lecture4.pdf

Data Analysis with Excel. (2016). Retrieved from http://acpacommissiononassessment.pbworks.com/f/Basic+Quantitative+Analysis+Using+Excel.pdf

Lane, D. M. (n.d.). 10. Estimation [E-book]. Retrieved from http://onlinestatbook.com/2/estimation/estimation.pdf

Tanbakuchi, A. (2009). Measures of Variation [PDF]. Retrieved from https://www.baruch.cuny.edu/sacc/documents/MeasuresofVariation.pdf

StudyBounty. (2023, September 15). What is Raw Data and How Can I Use It?.
https://studybounty.com/17-what-is-raw-data-and-how-can-i-use-it-term-paper
