Introduction
Data analysis is vital in the real estate industry because it allows a company to gain a competitive advantage over its rivals. D.M. Pan Real Estate Company aims to outperform its rivals by helping potential homeowners and home sellers make informed decisions. The company's sales department needs a strong understanding of the relationships among key variables such as selling price, square footage, and location so that it can offer the best advice to its clients. This report presents the findings from an initial analysis of the link between property selling price and square footage in the East North Central region.
Representative Data Sample
The simple random sample of 30 is indicated below.
The chosen region is the East North Central region. The mean, median, and standard deviation of the listing price and square footage variables are indicated below.
Statistic | Listing price | Square feet |
Mean | $238,880 | 1,920 |
Median | $223,350 | 1,747 |
Standard deviation | $87,207.95 | 866.54 |
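The three statistics in the table can be reproduced with Python's standard library. The sketch below is illustrative only: the five prices are hypothetical values, not the report's sample, and it assumes the sample standard deviation (n - 1 denominator), since the report does not say which form was used.

```python
import statistics

# Hypothetical listing prices in USD, for illustration only
# (these are NOT the 30 values from the report's sample).
listing_prices = [150_000, 200_000, 250_000, 300_000, 350_000]

mean_price = statistics.mean(listing_prices)
median_price = statistics.median(listing_prices)
# Assumption: sample standard deviation (n - 1 denominator).
stdev_price = statistics.stdev(listing_prices)

print(f"mean: ${mean_price:,.2f}")
print(f"median: ${median_price:,.2f}")
print(f"standard deviation: ${stdev_price:,.2f}")
```

The same three calls apply unchanged to the square footage column.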
Data Analysis
The regional sample does not reflect the national market. A comparison of the descriptive statistics of regional and national listing prices shows that the mean, median, and standard deviation of the national market are considerably greater than those of the East North Central region. In addition, the mean and median square footage for the national market are higher than those of the East North Central regional market. The standard deviation of square footage, however, is higher for the East North Central region than for the national market. In this respect, the square footage data for the East North Central region has a greater spread around the mean than the square footage data for the national market.
The sample was randomized using the RAND function in Excel. RAND was used to generate a random decimal number for each East North Central entry in the Excel file. The entries were then sorted by these random numbers from smallest to largest, and the first thirty entries were selected as the sample. The listing prices and square footage values were sorted along with the random numbers, so each pair stayed together. Because every entry had an equal chance of being selected, the procedure produces a simple random sample.
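The spreadsheet procedure described above can be sketched in Python as follows. This is a minimal illustration, not the method actually used in the report (which relied on Excel): the function name `simple_random_sample`, the `seed` parameter, and the generated rows are all hypothetical.

```python
import random

def simple_random_sample(rows, sample_size, seed=None):
    """Tag each row with a random number, sort by it, keep the first rows.

    Mirrors the RAND-then-sort approach: each row keeps its listing price
    and square footage together while the rows are reordered.
    """
    rng = random.Random(seed)
    keyed = [(rng.random(), row) for row in rows]  # one random decimal per row
    keyed.sort(key=lambda pair: pair[0])           # smallest to largest
    return [row for _, row in keyed[:sample_size]]

# Hypothetical (listing price, square feet) rows standing in for the regional data.
regional_rows = [(100_000 + 1_000 * i, 1_000 + 10 * i) for i in range(200)]

sample = simple_random_sample(regional_rows, 30, seed=42)
print(len(sample))
```

In practice, `random.sample(rows, 30)` from the standard library draws an equivalent sample in one call; the longer form is shown only because it follows the spreadsheet steps one by one.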
Scatterplot
The scatterplot for the random sample is depicted below. A trendline and the regression equation are indicated on the chart.
The Pattern
The graph has two key variables, namely the x-variable and the y-variable. The x-variable is the independent variable, and it represents the square footage of a given house. The y-variable is the dependent variable, given that it is affected by the x-variable: the listing price depends on the size of the house, which is measured in square feet. In this case, if the square footage of a house is known, its listing price can be estimated from the trendline. As a result, the independent variable is useful for making reliable predictions.
Based on the scatterplot, there is a positive linear relationship between the size of a house and its listing price. The values of the two variables increase together, as indicated by the distribution of the data points on the chart. An increase in a house's square footage is associated with an increase in its listing price. The scatterplot's shape is linear, given that a straight line that fits the data points well can be drawn. There are potential outliers in the scatterplot, for instance, the house in Greene County with a listing price of $581,800 and 5,146 square feet. The outliers appear in the scatterplot because of the sample's high level of diversity. They are the data points that lie the furthest from the mean.
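The trendline and the strength of the linear relationship can both be computed directly from the paired data. The sketch below uses ordinary least squares and the Pearson correlation coefficient on five hypothetical houses; the data and the function names `least_squares_line` and `pearson_r` are illustrative assumptions, not values or tools from the report.

```python
import math

def least_squares_line(xs, ys):
    """Return (slope, intercept) of the ordinary least-squares line y = slope*x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient between xs and ys."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data, for illustration only (not the report's sample).
square_feet = [1000, 1500, 2000, 2500, 3000]
listing_prices = [160_000, 190_000, 260_000, 290_000, 350_000]

slope, intercept = least_squares_line(square_feet, listing_prices)
r = pearson_r(square_feet, listing_prices)
print(f"y = {slope:.3f}x + {intercept:.0f}, r = {r:.3f}")
```

A positive slope together with a correlation coefficient close to 1 corresponds to the upward, roughly straight-line pattern described above.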
Regression equation | y = 91.588x + 63,029 |
Price for 1,800-square-foot house | 91.588 × 1,800 + 63,029 = $227,887.40 |
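The prediction in the table can be checked with a few lines of Python. The slope and intercept are taken from the report's regression equation; the function name `predict_listing_price` is a hypothetical helper. As an additional check, the sketch compares the potential outlier mentioned earlier (5,146 square feet, listed at $581,800) with the price the trendline would predict for it.

```python
SLOPE = 91.588      # dollars per additional square foot (from the regression equation)
INTERCEPT = 63_029  # dollars (from the regression equation)

def predict_listing_price(square_feet):
    """Predicted listing price in dollars for a house of the given size."""
    return SLOPE * square_feet + INTERCEPT

predicted_1800 = predict_listing_price(1800)
print(f"1,800 sq ft: ${predicted_1800:,.2f}")

# Residual = actual listing price minus the trendline's prediction.
outlier_predicted = predict_listing_price(5146)
outlier_residual = 581_800 - outlier_predicted
print(f"5,146 sq ft: predicted ${outlier_predicted:,.2f}, residual ${outlier_residual:,.2f}")
```

The positive residual means the house is listed above what the trendline predicts for its size. Its square footage is also far above the sample mean of 1,920, and such a point can have a strong influence on the fitted line.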