Upgrade & Secure Your Future with DevOps, SRE, DevSecOps, MLOps!

We spend hours on Instagram and YouTube and waste money on coffee and fast food, but won’t spend 30 minutes a day learning skills to boost our careers.
Master in DevOps, SRE, DevSecOps & MLOps!

Learn from Guru Rajesh Kumar and double your salary in just one year.


Get Started Now!

Shell script merge two list and remove duplicates

rajeshkumar created the topic: shell script merge two list and remove duplicates

You want all the records from list_A supplemented by all the records from list_B for which there is not already a matching name in list A. Mathematically this is:

A + B - {w in B | (w,value) in A }

\

There are many ways of accomplishing this, depending on access and needed efficiencies.

* If you can modify DB1 (with A), then download table B from DB2, upload it to DB1, then extract your data with the appropriate join
* If you can’t modify DB1, then download both A and B and concatenate them to the same stream, with A followed by B. Then sort by the first field. Then process the stream one record at time. Duplicate names will be side-by-side. If the same name appears more than one time, print the first and ignore subsequent records with the same name.

Here is a sample solution to your problem (starting with two lists of names/values)

#!/bin/bash

A="Smith value1
Jones value2
Wilson value3"

B="Smith value10
Wilson value11
Fox value12
Brown value13"

PrevName="Not a valid name"
echo "$A
$B" | sort -k1 |
while read Name Value
do
if [ "$Name" != "$PrevName" ]; then
echo $Name $Value
fi
PrevName="$Name"
done > outfile

You want all the records from list_A supplemented by all the records from list_B for which there is not already a matching name in list A. Mathematically this is:

A + B – {w in B | (w,value) in A }

There are many ways of accomplishing this, depending on access and needed efficiencies.

* If you can modify DB1 (with A), then download table B from DB2, upload it to DB1, then extract your data with the appropriate join
* If you can’t modify DB1, then download both A and B and concatenate them to the same stream, with A followed by B. Then sort by the first field. Then process the stream one record at time. Duplicate names will be side-by-side. If the same name appears more than one time, print the first and ignore subsequent records with the same name.

Here is a sample solution to your problem (starting with two lists of names/values):

#!/bin/bash

A=”Smith value1
Jones value2
Wilson value3″

B=”Smith value10
Wilson value11
Fox value12
Brown value13″

PrevName=”Not a valid name”
echo “$A
$B” | sort -k1 |
while read Name Value
do
if [ “$Name” != “$PrevName” ]; then
echo $Name $Value
fi
PrevName=”$Name”
done > outfile

Here is the output:
Brown value13
Fox value12
Jones value2
Smith value1
Wilson value11
Regards,
Rajesh Kumar
Twitt me @ twitter.com/RajeshKumarIn

Subscribe
Notify of
guest
0 Comments
Newest
Oldest Most Voted
Inline Feedbacks
View all comments

Certification Courses

DevOpsSchool has introduced a series of professional certification courses designed to enhance your skills and expertise in cutting-edge technologies and methodologies. Whether you are aiming to excel in development, security, or operations, these certifications provide a comprehensive learning experience. Explore the following programs:

DevOps Certification, SRE Certification, and DevSecOps Certification by DevOpsSchool

Explore our DevOps Certification, SRE Certification, and DevSecOps Certification programs at DevOpsSchool. Gain the expertise needed to excel in your career with hands-on training and globally recognized certifications.

0
Would love your thoughts, please comment.x
()
x