Merging two data set in R based on one common column

Let’s create a new dataset using mtcars dataset and only mpg and hp column:

> cars.mpg <- subset(mtcars, select = c(mpg, hp))
 
> cars.mpg
                     mpg  hp
Mazda RX4           21.0 110
Mazda RX4 Wag       21.0 110
Datsun 710          22.8  93
Hornet 4 Drive      21.4 110
Hornet Sportabout   18.7 175
Valiant             18.1 105
Duster 360          14.3 245
Merc 240D           24.4  62
Merc 230            22.8  95
Merc 280            19.2 123
Merc 280C           17.8 123
Merc 450SE          16.4 180
Merc 450SL          17.3 180
Merc 450SLC         15.2 180
…………..

Let’s create another dataset using mtcars dataset and only hp and cyl column:

> cars.cyl <- subset(mtcars, select = c(hp,cyl))
> cars.cyl
                     hp cyl
Mazda RX4           110   6
Mazda RX4 Wag       110   6
Datsun 710           93   4
Hornet 4 Drive      110   6
Hornet Sportabout   175   8
Valiant             105   6
Duster 360          245   8
Merc 240D            62   4
Merc 230             95   4
Merc 280            123   6
Merc 280C           123   6
Merc 450SE          180   8
Merc 450SL          180   8
Merc 450SLC         180   8
…………………..

Now we can merge both dataset based on common column hp as below:

 
> merge.ds <- merge(cars.mpg, cars.cyl, by="hp")
> merge.ds
    hp  mpg cyl
1   52 30.4   4
2   62 24.4   4
3   65 33.9   4
4   66 32.4   4
5   66 32.4   4
6   66 27.3   4
7   66 27.3   4
8   91 26.0   4
9   93 22.8   4
10  95 22.8   4
11  97 21.5   4
12 105 18.1   6
13 109 21.4   4
14 110 21.0   6
15 110 21.0   6
16 110 21.0   6
17 110 21.0   6
18 110 21.0   6
19 110 21.0   6
20 110 21.4   6
21 110 21.4   6
22 110 21.4   6
23 113 30.4   4
24 123 17.8   6
25 123 17.8   6
26 123 19.2   6
27 123 19.2   6
28 150 15.2   8
29 150 15.2   8
30 150 15.5   8
31 150 15.5   8
32 175 18.7   8
33 175 18.7   6
34 175 18.7   8
35 175 19.7   8
36 175 19.7   6
37 175 19.7   8
38 175 19.2   8
39 175 19.2   6
40 175 19.2   8
41 180 16.4   8
42 180 16.4   8
43 180 16.4   8
44 180 17.3   8
45 180 17.3   8
46 180 17.3   8
47 180 15.2   8
48 180 15.2   8
49 180 15.2   8
50 205 10.4   8
51 215 10.4   8
52 230 14.7   8
53 245 14.3   8
54 245 14.3   8
55 245 13.3   8
56 245 13.3   8
57 264 15.8   8
58 335 15.0   8

Why you see total 58 merged rows when there were only 32 rows in original data sets?

This is because "merge is a generic function whose principal method is for data frames: the default method coerces its arguments to data frames and calls the "data.frame" method."
Learn more by calling
?merge
Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s