Last active
August 29, 2015 14:18
-
-
Save adusak/d3bd46bb48ef476a912c to your computer and use it in GitHub Desktop.
du utility implemented in haskell
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| {- | This is the third assignment for IB016, semester spring 2015. | |
| Name: Adam Melkus | |
| UID: 374010 | |
| == Implementing du | |
| In this assignment you will be implementing simplified version of Unix | |
| utility @du@. This utility can be used to detect filesystem usage by | |
| files and directories. Your task is for the first time to implement whole | |
| compilable module, including commandline handling. | |
| === du usage / commandline options | |
| Your program usage should be: | |
| @ | |
| ./du [options] [files-or-directories] | |
| @ | |
| You should implement following options: | |
| @ | |
| -a, --all write counts for all files, not just directories | |
| -h, --human-readable print sizes in human readable format (e.g., 1KiB 234MiB 2GiB) | |
| --si like -h, but use powers of 1000 not 1024 (e.g., 1KB, 245MB, 2.1GB) | |
| -s, --summarize display only a total for each argument | |
| -c, --total produce a grand total | |
| --help display help and exit | |
| @ | |
| You don't need to handle invalid options, you can ignore them. | |
| Optionally, as a bonus, you can also implement: | |
| @ | |
| -d, --max-depth print the total for a directory (or file, with --all) | |
| only if it is N or fewer levels below the command line | |
| argument; --max-depth=0 is the same as --summarize | |
| @ | |
| In this case, you can assume that option and its value is not separated by space | |
| — for short version number follows immediately, while in long option it | |
| is separated by @'='@: @-d0 -d10 --max-depth=0 --max-depth=10@. | |
| === du behaviour | |
| When @du@ is run without any options, it prints sizes of all its commandline | |
| arguments: for files size is printed directly, for directories their | |
| size is summarized recursively. Files inside directories are not printed | |
| by default. If any options are given, they must precede all files and | |
| directories. If no files or directories are given, @du@ should work with | |
| current working directory (./). | |
| File size can be obtained by the function 'hFileSize' from @System.IO@ | |
| (alternatively by 'fileSize' from @System.Posix.Files@, but this is not | |
| multiplatform). Without --human-readable or -h or --si sizes are printed | |
| in kilobytes without unit (1 KiB = 1024 B). | |
| With --human-readable or -h sizes are printed with appropriate | |
| unit (using binary prefixes <https://en.wikipedia.org/wiki/Binary_prefix>) | |
| such that the value is between 1 and 1023. If --si is given, sizes | |
| are handled similarly but using 1000-based SI prefixes (and value | |
| should be between 1 and 999). Sizes should be integral, as a bonus | |
| you can add one decimal place if value is less then 10, as done by | |
| @du@ on Linux. | |
| All the other commandline options should be handled according to usage given | |
| above. If --help is given a short help summarizing options and usage should | |
| be given and all other options should be ignored. | |
| Combination of @-s@ and @-a@ is not valid, and need not be handled, the same | |
| holds for --max-depth=0 and --all. | |
| For recursive traversal you will probably need functions from @System.Directory@ | |
| module. Feel free to use any other modules in standard Haskell distribution | |
| you find suitable (that means you can use any packages from Hackage working with | |
| GHC-4.8 and GHC-4.6). | |
| ==== Notes | |
| * in basic execution (without @-s@) subdirectories are printed. | |
| * in case of error (such as permission error or directory vanishing before | |
| it can be explored) program should not stop but print an error message | |
| (on 'stderr'), you should handle only 'IOException' and you can use | |
| that it is an instance of 'Show' | |
| * you can use module @Text.Printf@ for formating | |
| * you can ignore anything for which is neither file, nor directory (such | |
| as devices, symlinks, pipes,…) | |
| * you should not ignore hidden files (on Unix beginning with '.') | |
| * on Linux @du@ is calculating file sizes based on disk allocation, | |
| sizes reported by 'hFileSize' can differ | |
| ==== Examples | |
| Order of files and directories on same level in hierarchy is not relevant and can | |
| differ on your system. Also the output in case of error need not match literally. | |
| > $ ./du --help | |
| > usage: du [options] [files] | |
| > -a, --all write counts for all files, not just directories | |
| > -h, --human-readable print sizes in human readable format (e.g., 1K 234M 2G) | |
| > --si like -h, but use powers of 1000 not 1024 | |
| > -s, --summarize display only a total for each argument | |
| > -d, --max-depth print the total for a directory (or file, with --all) only if it is N or fewer levels below the command line argument; --max-depth=0 is the same as --summarize | |
| > -c, --total produce a grand total | |
| > --help display this help and exit | |
| > | |
| > $ mkdir test; cd test | |
| > $ mkdir -p first/second third | |
| > $ dd if=/dev/zero of=a bs=1024 count=100 &> /dev/null | |
| > $ dd if=/dev/zero of=first/b bs=1024 count=200 &> /dev/null | |
| > $ dd if=/dev/zero of=first/c bs=1024 count=300 &> /dev/null | |
| > $ dd if=/dev/zero of=first/second/d bs=1024 count=1024 &> /dev/null | |
| > | |
| > $ ../du | |
| > 0 ./third | |
| > 1024 ./first/second | |
| > 1524 ./first | |
| > 1624 . | |
| > | |
| > $ ../du first third | |
| > 1024 first/second | |
| > 1524 first | |
| > 0 third | |
| > | |
| > $ ../du -c first third | |
| > 1024 first/second | |
| > 1524 first | |
| > 0 third | |
| > 1524 total | |
| > | |
| > $ ../du first | |
| > 1024 first/second | |
| > 1524 first | |
| > | |
| > $ ../du -s first | |
| > 1524 first | |
| > | |
| > $ ../du --summarize first | |
| > 1524 first | |
| > | |
| > $ ../du -h first | |
| > 1.0 MiB first/second | |
| > 1.4 MiB first | |
| > | |
| > $ ../du --si first | |
| > 1.1 MB first/second | |
| > 1.5 MB first | |
| > | |
| > $ ./du -h -s -c first a | |
| > 1.4 MiB first | |
| > 100 KiB a | |
| > 1.5 MiB total | |
| > | |
| > $ ./du -a -h first | |
| > 200 KiB first/b | |
| > 300 KiB first/c | |
| > 1.0 MiB first/second/d | |
| > 1.0 MiB first/second | |
| > 1.4 MiB first | |
| > | |
| > $ mkdir fourth && chmod -r fourth | |
| > $ ../du fourth first | |
| > error: fourth: getDirectoryContents: permission denied (Permission denied) | |
| > 1024 first/second | |
| > 1524 first/ | |
| > | |
| > $ ./du fifth first | |
| > error: fifth: openFile: does not exist (No such file or directory) | |
| > 1024 first/second | |
| > 1524 first | |
| > | |
| > $ ../du -d1 | |
| > error: ./fourth: getDirectoryContents: permission denied (Permission denied) | |
| > 0 ./third | |
| > 1524 ./first | |
| > 1624 . | |
| > | |
| > $ ../du --max-depth=1 --human-readable | |
| > error: ./fourth: getDirectoryContents: permission denied (Permission denied) | |
| > 0.0 B ./third | |
| > 1.4 MiB ./first | |
| > 1.5 MiB | |
| -} | |
| module Main ( main ) where | |
| import Control.Exception | |
| import Control.Monad (foldM, forM_, liftM, when) | |
| import Data.List | |
| import Data.Prefix.Units | |
| import Numeric | |
| import System.Console.GetOpt | |
| import System.Directory | |
| import System.Environment | |
| import System.FilePath | |
| import System.IO | |
| import Text.Printf | |
| ------------------------------------- Option handling ------------------------------------------- | |
| data ObjectType = Directory | File | Error deriving (Eq, Show) | |
| data Options = Options | |
| { optHelp :: Bool | |
| , optAll :: Bool | |
| , optReadable :: Bool | |
| , optReadableSi :: Bool | |
| , optSumarize :: Bool | |
| , optTotal :: Bool | |
| } deriving Show | |
| defaultOptions = Options | |
| { optHelp = False | |
| , optAll = False | |
| , optReadable = False | |
| , optReadableSi = False | |
| , optSumarize = False | |
| , optTotal = False | |
| } | |
| options :: [OptDescr (Options -> Options)] | |
| options = | |
| [ Option "a" ["all"] | |
| (NoArg (\ opts -> opts { optAll = True })) | |
| "write counts for all files, not just directories" | |
| , Option "h" ["human-readable"] | |
| (NoArg (\ opts -> opts { optReadable = True })) | |
| "print sizes in human readable format (e.g., 1KiB 234MiB 2GiB)" | |
| , Option "" ["si"] | |
| (NoArg (\ opts -> opts { optReadableSi = True })) | |
| "like -h, but use powers of 1000 not 1024 (e.g., 1KB, 245MB, 2.1GB)" | |
| , Option "s" ["summarize"] | |
| (NoArg (\ opts -> opts { optSumarize = True })) | |
| "display only a total for each argument" | |
| , Option "c" ["total"] | |
| (NoArg (\ opts -> opts { optTotal = True })) | |
| "produce a grand total" | |
| , Option "" ["help"] | |
| (NoArg (\opts -> opts { optHelp = True })) | |
| "display help and exit" | |
| ] | |
| header = "usage: du [options] [files]\n" | |
| compilerOpts :: [String] -> IO (Options, [String]) | |
| compilerOpts argv = | |
| case getOpt RequireOrder options argv of | |
| (o, n, [] ) -> return (foldl (flip id) defaultOptions o, n) | |
| (_, _, errs) -> ioError (userError (concat errs ++ usageInfo header options)) | |
| ------------------------------------- Main ---------------------------------------------- | |
| main :: IO () | |
| main = do | |
| argv <- getArgs | |
| (opts, args) <- compilerOpts argv | |
| let folders = if null args then ["."] | |
| else reverse args | |
| if optHelp opts then putStr (usageInfo header options) | |
| else do result <- buildTree folders | |
| if optSumarize opts then printSumarization opts folders result | |
| else printNormal opts folders result | |
| when (optTotal opts) $ printTotal opts folders result | |
| printSumarization :: Options -> [String] -> [(String, Integer, ObjectType)] -> IO () | |
| printSumarization opts folders result = forM_ (filterByNames folders result) $ \(name, size, kind) -> | |
| if kind == Error | |
| then putStrLn name | |
| else putStrLn (formatSize opts size ++ "\t\t" ++ name) | |
| printNormal :: Options -> [String] -> [(String, Integer, ObjectType)] -> IO () | |
| printNormal opts folders result = forM_ result $ \(name, size, kind) -> | |
| if kind == Error | |
| then putStrLn name | |
| else when (optAll opts || kind == Directory || elem name folders) $ | |
| putStrLn (formatSize opts size ++ "\t\t" ++ name) | |
| printTotal :: Options -> [String] -> [(String, Integer, ObjectType)] -> IO () | |
| printTotal opts folders result = putStrLn (formatSize opts (total folders folders result) ++ "\t\tTotal") | |
| formatSize :: Options -> Integer -> String | |
| formatSize opts size | |
| | not (optReadable opts || optReadableSi opts) = show (size `div` 1024) | |
| | optReadable opts = showValue (Left FormatBinary) size ++"B" | |
| | optReadableSi opts = showValue (Left FormatSiKMGT) size ++"B" | |
| {-- | Calculates and formats file siz to human readable form | |
| formatSize :: Options -> Integer -> String | |
| formatSize opts size | |
| | not (optReadable opts || optReadableSi opts) = show (size `div` 1024) | |
| | optReadable opts = humanReadableByteCount False size | |
| | optReadableSi opts = humanReadableByteCount True size | |
| -- | Calculates and formats file siz to human readable form | |
| humanReadableByteCount :: Bool -> Integer -> String | |
| humanReadableByteCount si size | |
| | size < unit = show size ++ " B" | |
| | otherwise = stripChars "\"'" (round' 1 formula ++ " " ++ pre ++ "B") | |
| where | |
| unit = if si then 1000 else 1024 | |
| exp = round (logBase (fI unit) (fI size)) | |
| chars = if si then "kMGTPE" else "KMGTPE" | |
| theI = if si then "" else "i" | |
| pre = show (chars !! (exp - 1)) ++ theI | |
| formula = fI size / (fI unit ^ exp) | |
| -- | Rounds a float to specified number of decimal places | |
| round' :: RealFloat a => Int -> a -> String | |
| round' numDecimals num = showGFloat (Just numDecimals) num "" | |
| -- | Removes specified characters from string | |
| stripChars :: String -> String -> String | |
| stripChars = filter . flip notElem | |
| -- | Shortcut for fromIntegral | |
| fI :: (Num a) => Integer -> a | |
| fI = fromIntegral | |
| -} | |
| -- | Calculates total size of specified files/directories | |
| total :: [String] -> [String] -> [(String, Integer, ObjectType)] -> Integer | |
| total _ _ [] = 0 | |
| total [] _ _ = 0 | |
| total (a:as) args list = case findByName a list of | |
| Nothing -> 0 + total as args list | |
| (Just (n, s, _)) -> if not (isInFolder n args) then s + total as args list | |
| else 0 + total as args list | |
| -- | Gets a tupples from result set based on the name | |
| findByName :: String -> [(String, Integer, ObjectType)] -> Maybe (String, Integer, ObjectType) | |
| findByName _ [] = Nothing | |
| findByName name ((x,y,z):xs) = if name == x then Just (x,y,z) | |
| else findByName name xs | |
| -- | Filters result set with specified files/directories | |
| filterByNames :: [String] -> [(String, Integer, ObjectType)] -> [(String, Integer, ObjectType)] | |
| filterByNames [] _ = [] | |
| filterByNames (n:ns) list = case findByName n list of | |
| Nothing -> [] ++ filterByNames ns list | |
| Just something -> something : filterByNames ns list | |
| -- | Checks if file is contained in specified folder | |
| isInFolder :: String -> [String] -> Bool | |
| isInFolder _ [] = False | |
| isInFolder object (f:fs) = (isPrefixOf f object && f /= object) || isInFolder object fs | |
| -- | Recursively goes through the directory structure, calucating file and directory sizes | |
| -- Returns an IO list of tupples in the following format (File/Directory name, size, indicator wheather it is a directory or file) | |
| buildTree :: [String] -> IO [(String, Integer, ObjectType)] | |
| buildTree = foldM f [] | |
| where | |
| f acc object = handle handler $ do | |
| isDir <- doesDirectoryExist object | |
| if isDir | |
| then do | |
| entries <- liftM (filter isNormalFolder) (getDirectoryContents object) | |
| moreRes <- buildTree (map (object </>) entries) | |
| let folderSize = countSize object moreRes | |
| return (moreRes ++ [(object, folderSize, Directory)] ++ acc ) | |
| else do | |
| fileHandle <- openFile object ReadMode | |
| intSize <- hFileSize fileHandle | |
| hClose fileHandle | |
| return (acc ++ [(object, intSize, File)]) | |
| handler :: SomeException -> IO [(String, Integer, ObjectType)] | |
| handler e = return [(show e, 0, Error)] | |
| -- | Test to check if the object is not . or .. | |
| isNormalFolder :: String -> Bool | |
| isNormalFolder f = f /= "." && f /= ".." | |
| -- | Calcualtes directory size from a list | |
| countSize :: String -> [(String, Integer, ObjectType)] -> Integer | |
| countSize _ [] = 0 | |
| countSize root ((x,y,_):xs) | |
| | root == takeDirectory x = y + countSize root xs | |
| | otherwise = 0 + countSize root xs |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment