web developer & system programmer

coder . cl

ramblings and thoughts on programming...


challenge 2012.02.13 intermediate

published: 14-02-2012 / updated: 14-02-2012
posted in: development, haskell, programming, tips
by Daniel Molina Wegener

Here is the another combinatorics challenge with a level intermediate on the programming forum. Still I think that they are very easy to solve. Here is the solution in Haskell, where not real combinatorics are required to supply a solution, instead word elements are used as numbers to find word matches in the text file, without too many libraries and lines of code.

Your challenge today is to write a program that can find the amount of anagrams within a .txt file. For example, “snap” would be an anagram of “pans”, and “skate” would be an anagram of “stake”.


module Main where

import Data.Char
import Data.List
import Data.Maybe
import Data.String.Utils
import System.Environment

liftLine :: String -> [Int]
liftLine l = fmap ord $ strip l

matchWord :: [Int] -> [Int] -> Maybe [Int]
matchWord w1 w2 = if sort w1 == sort w2
                     then Just w2
                     else Nothing

findMatches :: [Int] -> [[Int]] -> [String]
findMatches x xs = fmap (fmap chr . fromJust)
                   $ filter (/= Nothing)
                   $ fmap (matchWord x) xs

main :: IO ()
main = do [n, w] <- getArgs
          f <- readFile n
          let words = fmap liftLine $ lines f
              wmatch = liftLine w
              in mapM_ putStrLn $ findMatches wmatch words

Where the code above can be compiled using ghc --make prog.hs -o prog or can be executed using runghc prog.hs text-file word. With the text file bellow:

snap
pans
skate
stake
takes
kates
tekas
ketas

Its output is as follows:

13:02 [dmw@www:0 exercises]$ ./challenge20120213 challenge20120213.txt kates
skate
stake
takes
kates
tekas
ketas

Enjoy…


6 comments to “challenge 2012.02.13 intermediate”

  1. Solution in C#:

    
    using System;
    using System.Collections.Generic;
    using System.IO;
    using System.Linq;
    
    namespace testanagramas
    {
    	class MainClass
    	{
    		public static void Main (string[] args)
    		{
    			string line;
    			var words = new List<string>();
    			using (var reader = new StreamReader (args [0]))
    				while ((line = reader.ReadLine ()) != null)
    					words.Add (line.Trim ());
    			
    			var word = args [1];
    			var anagrams =
    				from w in words
    				where w.Length == word.Length
    				where w.ToCharArray ().Count (c => word.Contains (c)) == w.Length
    				select w;
    			
    			foreach (var a in anagrams)
    				Console.WriteLine (a);
    		}
    	}
    }
    
    
  2. Appears LLVM makes a Haskell/GHC program faster than Mono.

    07:40 [dmw@www:1 exercises]$ time mono challenge20120213.exe challenge20120213.txt kates > /dev/null
    
    real    0m0.058s
    user    0m0.056s
    sys     0m0.004s
    
    07:40 [dmw@www:1 exercises]$ time ./challenge20120213 challenge20120213.txt kates > /dev/null
    
    real    0m0.003s
    user    0m0.000s
    sys     0m0.000s
    
    
  3. You may try with LLVM+Mono also:

    Mono/LLVM

    I will run a benchmark later.

  4. I tested with AOT but it’s still low. I think it’s the IO layer :(

  5. Hi,

    I come here because I was googling about Haskell as I’m trying to learning it.

    I really don’t understand your the aim of your solution, why do you “instead word elements are used as numbers to find word matches in the text file”?

    The sort function of Data.List can sort a string (because they are list, of course).

    So a simpler solution based in your code is:

    
    module Main where
     
    import Data.List
    import Data.String.Utils
    import System.Environment
     
    main :: IO ()
    main = do [n, w] < - getArgs
              f <- readFile n
              let words = fmap strip $ lines f
                  wmatch = sort w
                  in mapM_ putStrLn $ filter (x -> sort x == wmatch) words
    
    

    In fact I’ve tested the performance of these two versions and the former gives a time of 643.943 seconds when is tested against a 2GiB wordlist file (with “anything” as search word), and the later solution takes only 312.493 seconds.

  6. Well, I’ve solved that problem in few minutes… your solution can also be expressed as follows.

    
    module Main where
    
    import Data.List
    import Data.String.Utils
    import System.Environment
    import System.IO.Unsafe
    
    main :: IO ()
    main = do [n, w] < - getArgs
              let wmatch = sort w
                  in mapM_ putStrLn $ filter ((wmatch ==) . sort)
                     $ fmap strip (lines (unsafePerformIO (readFile n)))
    
    

    Which is simpler than yours. But not all people understands Haskell and all its tricky notations.

post a comment

XHTML: You can use these tags: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong>