I have 16 collections of 8 numbers each:

(11 12 11 44 11 12 11 23)
(12 21 11 44 11 12 11 23)
(11 42 21 13 12 21 31 14)
(11 42 21 13 12 21 11 34)
and so on

I want to find out whether any of the 8-tuples is repeated. The order in which the numbers occur within a tuple is not important.

The result should give the number of occurrences of each pattern:

(11 12 11 44 11 12 11 23) 1
(12 21 11 44 11 12 11 23) 1
(11 42 21 13 12 21 31 14) 2
(11 42 21 13 12 21 11 34) 1
Vera
  • I am struggling on how an algorithm might work for this. – Vera Dec 31 '22 at 17:36
  • OK, what do you have so far? This isn't a Linux & Unix issue, you are asking us to just write a program for you and this isn't what the site is for. Show us what you have, and we'll be happy to help with it, but here you are just dumping the problem on us with what looks like no attempt on your part. Hint: start by reading each line, sorting the numbers and counting. – terdon Dec 31 '22 at 17:43
  • Hints are what I need to attempt this, at least for the important bits. Sorting the numbers is a big help here. I did not think of that. – Vera Dec 31 '22 at 18:00
  • No, all you need to attempt is to actually try something. Spend a little time searching the internet using key words like "sort" and "unique". This is really, really basic stuff but you do need to put in a bit of effort. You don't need to get it perfect, just do enough to show people that you aren't just expecting others to do the work for you and that instead you want to do the work but need help with specific parts of it. – terdon Dec 31 '22 at 18:13

1 Answer

Here is one method. I started with this input file, which includes some repeated lines:

cat tupl
11 12 11 44 11 12 11 23
12 21 11 44 11 12 11 23
12 21 11 44 11 12 11 23
12 21 11 44 11 12 11 23
12 21 11 44 11 12 11 23
11 42 21 13 12 21 31 14
11 42 21 13 12 21 11 34
11 12 11 44 11 12 11 23

Since you say the order of the numbers in each row is unimportant, we can first sort the numbers within each row. Note that asort() is a GNU awk extension, so this step needs gawk:

awk ' {split( $0, a, " " ); asort( a ); for( i = 1; i <= length(a); i++ ) printf( "%s ", a[i] ); printf( "\n" ); }' tupl 
11 11 11 11 12 12 23 44 
11 11 11 12 12 21 23 44 
11 11 11 12 12 21 23 44 
11 11 11 12 12 21 23 44 
11 11 11 12 12 21 23 44 
11 12 13 14 21 21 31 42 
11 11 12 13 21 21 34 42 
11 11 11 11 12 12 23 44 
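
If gawk is not available, the same sorting step can be sketched in standard awk with a hand-written insertion sort. This is only a minimal sketch, not part of the method above: the function name sort_rows is hypothetical, and it assumes every field is a plain integer. It also strips the parentheses that appear in the question's input format, and it prints rows without the trailing space.

```shell
# sort_rows FILE: print each row of FILE with its numbers sorted ascending.
# Hypothetical helper; assumes whitespace-separated integers, optionally
# wrapped in parentheses as in the question.
sort_rows() {
    tr -d '()' < "$1" |
    awk 'NF {
        # Copy the fields into an array, forcing numeric values.
        for (i = 1; i <= NF; i++) v[i] = $i + 0
        # Insertion sort on v[1..NF]; standard awk has no built-in sort.
        for (i = 2; i <= NF; i++) {
            x = v[i]
            for (j = i - 1; j >= 1 && v[j] > x; j--) v[j + 1] = v[j]
            v[j + 1] = x
        }
        # Join the sorted values with single spaces (no trailing space).
        row = v[1]
        for (i = 2; i <= NF; i++) row = row " " v[i]
        print row
    }'
}
```

Calling sort_rows tupl should give the same sorted rows as the gawk command, and those rows can be piped into a counting step in the same way.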

Now you can use awk's associative arrays to count identical rows by piping that output into this command:

awk '{a[$0]++} END {for (i in a) print i, a[i]}'
11 12 13 14 21 21 31 42  1
11 11 11 11 12 12 23 44  2
11 11 12 13 21 21 34 42  1
11 11 11 12 12 21 23 44  4
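
The counting step can also be done with the sort and uniq tools hinted at in the comments on the question. The following is a sketch under assumptions: count_rows is a hypothetical name, and it expects rows that are already sorted internally on standard input. It trims trailing blanks first, so that rows differing only in trailing whitespace are counted together, and it moves the count to the end of each line to match the output format asked for.

```shell
# count_rows: read internally sorted rows on stdin and print each distinct
# row followed by the number of times it occurs. Hypothetical helper.
count_rows() {
    # Trim trailing whitespace and drop empty lines.
    sed -e 's/[[:space:]]*$//' -e '/^$/d' |
    # uniq only merges adjacent duplicates, so sort the rows first.
    LC_ALL=C sort |
    uniq -c |
    # uniq -c prints "count row"; rearrange to "row count".
    awk '{ n = $1; $1 = ""; sub(/^ +/, ""); print $0, n }'
}
```

Unlike the for (i in a) loop, whose iteration order is unspecified, this prints the patterns in sorted order.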
dhm
  • Note that the input format contains parentheses and that asort() is not a standard awk function, nor is the way you use length standard. – Kusalananda Dec 31 '22 at 23:57
  • You might want to include some input lines that are different orderings of the same numbers, just to show the reordering works. (Not that this really belongs in unix.se, but anyway.) – ilkkachu Jan 01 '23 at 10:13