I have a list of products, and every product has a string field that contains a list of tags concatenated with the "#" character, like this:

Tag1#Tag2#Tag3

I need to get all the tags and order them by how many times each one occurs.

This is how I currently do it:

List<string> t = new List<string>();

var tags = (from p in db.Products
                    where p.Active
                    select p.Tags
                    ).ToList();

foreach (var item in tags)
{
   if (item == null)
      continue;
   var d = item.Split('#');
   foreach (var item2 in d)
   {
      t.Add(item2);
   }
}

var ans = t.GroupBy(p => new { id = p }).Select(g => new { id = g.Key.id, total = g.Count() }).OrderByDescending(g => g.total).ToList();

But I'm sure it's not simple (and maybe not optimized). Can someone help me make this code simpler and better, for example with a LINQ statement?

1 Answer

Here's my variant:

using System;
using System.Linq;

namespace TagsSplitExample
{
    public class Product
    {
        public bool Active { get; set; }
        public string Tags { get; set; }
    }

    class Program
    {
        static void Main(string[] args)
        {
            var products = new[]
            {
                new Product{ Active = true, Tags = "Tag1"},
                new Product{ Active = true, Tags = "Tag1#Tag2"},
                new Product{ Active = true, Tags = "Tag1#Tag2#Tag3"},
            };

            var allTags = products
                .Where(p => p.Active && p.Tags != null)
                .Select(p => p.Tags)
                .Select(tags => tags.Split('#'))
                .SelectMany(tag => tag)
                .GroupBy(tag => tag)
                .Select(group => new { Tag = group.Key, Count = group.Count() })
                .OrderByDescending(pair => pair.Count)
                .ToList();

            allTags.ForEach(pair => Console.WriteLine($"{pair.Tag} : {pair.Count}"));

            Console.ReadLine();
        }
    }
}

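The final ToList() can be omitted if you only need to enumerate the result. In that case, replace allTags.ForEach(...) with a plain foreach loop, because ForEach is a method of List<T>.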

Result:

Tag1 : 3
Tag2 : 2
Tag3 : 1
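
The Select / Split / SelectMany steps in the query above can also be collapsed into a single SelectMany, and the result can be enumerated with foreach instead of being materialized with ToList(). The following is only a minimal sketch over in-memory strings, not a replacement for the answer's code: TagCounter and CountTags are hypothetical names, and the use of StringSplitOptions.RemoveEmptyEntries is an added assumption so that input such as "Tag1##Tag2" does not produce an empty tag.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class TagCounter
{
    // Hypothetical helper: counts tags across '#'-separated strings,
    // returning (tag, count) pairs with the most frequent tag first.
    public static IEnumerable<KeyValuePair<string, int>> CountTags(IEnumerable<string> tagStrings)
    {
        return tagStrings
            // Skip products that have no tags at all.
            .Where(s => !string.IsNullOrEmpty(s))
            // Split and flatten in one step; empty entries are dropped (assumption).
            .SelectMany(s => s.Split(new[] { '#' }, StringSplitOptions.RemoveEmptyEntries))
            .GroupBy(tag => tag)
            .Select(g => new KeyValuePair<string, int>(g.Key, g.Count()))
            .OrderByDescending(pair => pair.Value);
    }

    public static void Main()
    {
        var tagStrings = new[] { "Tag1", "Tag1#Tag2", "Tag1#Tag2#Tag3", null };

        // No ToList() here: the query runs lazily while foreach enumerates it.
        foreach (var pair in CountTags(tagStrings))
        {
            Console.WriteLine($"{pair.Key} : {pair.Value}");
        }
    }
}
```

Like the original query, this runs in memory (LINQ to Objects), so it does not by itself address the case where the tags live in a large database table.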

3 Comments

Great! But there is a small difference: I read the products from a database, so they are not in memory. That causes a problem at the line Select(tags => tags.Split('#')). I can solve it by pulling the data into memory with .Select(p => p.Tags).ToList(), but the product table is really big and that worries me. Any ideas?
What is your database?
You need to do the data processing on your server. For example: learn.microsoft.com/en-us/sql/t-sql/functions/… Or you can create a database view with denormalized tags.
