How to find the highest repeating word from a text file in Java - Word Count Problem

How to find the words and their count from a text file is another frequently asked coding question from Java interviews. The logic to solve this problem is similar to what we have seen in how to find duplicate words in a String. In the first step you need to build a word Map by reading the contents of a text file. This Map should contain each word as a key and its count as the value. Once you have this Map ready, you can simply sort its entries by value. If you don't know how to sort a Map on values, see this tutorial first. It will teach you by sorting a HashMap on values. Now getting keys and values in sorted order should be easy, but remember that HashMap doesn't maintain order, hence you need to use a List to keep the entries in sorted order. Once you have this list, you can simply loop over it and print the key and value from each entry. This way, you can also create a table of words and their count in decreasing order. This problem is sometimes also asked as printing all words and their count in tabular format.
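The first step described above, building the word Map, can be sketched on an in-memory String before moving to a file. This is only a minimal illustration: the class name WordMapSketch and the method countWords are hypothetical names chosen for the example, and it assumes words are separated by whitespace and that case does not matter.

```java
import java.util.HashMap;
import java.util.Map;

class WordMapSketch {

    // Hypothetical helper: builds a word -> count map from in-memory text.
    static Map<String, Integer> countWords(String text) {
        Map<String, Integer> counts = new HashMap<>();
        for (String word : text.toLowerCase().split("\\s+")) {
            if (word.isEmpty()) {
                continue; // leading whitespace or empty input yields an empty token
            }
            // merge() stores 1 for a new word, otherwise adds 1 to the existing count
            counts.merge(word, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // HashMap iteration order is unspecified, so the printed order may vary
        System.out.println(countWords("Java is great and java is fast"));
    }
}
```

Map.merge (available since Java 8) replaces the containsKey/get/put sequence with a single call, which is handy when you want to keep the counting loop short in an interview.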



How to find the highest repeated word from a file

Here is the Java program to find the duplicate word which has occurred the maximum number of times in a file. You can also print the frequency of words from highest to lowest, because you have a List of Map entries which contains the words and their count in sorted order. All you need to do is iterate over each entry and print the keys and values.
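If only the single most repeated word is needed, a full sort is not strictly required. The sketch below is one possible shortcut, not part of the program in this article: it assumes the word Map has already been built, and the names HighestWordSketch and mostFrequent are made up for illustration.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

class HighestWordSketch {

    // Hypothetical helper: returns the entry with the largest count.
    // Collections.max throws NoSuchElementException if the map is empty,
    // and if several words share the top count, any one of them may be returned.
    static Map.Entry<String, Integer> mostFrequent(Map<String, Integer> wordMap) {
        return Collections.max(wordMap.entrySet(), Map.Entry.comparingByValue());
    }

    public static void main(String[] args) {
        Map<String, Integer> wordMap = new HashMap<>();
        wordMap.put("java", 3);
        wordMap.put("code", 1);
        wordMap.put("file", 2);
        Map.Entry<String, Integer> top = mostFrequent(wordMap);
        System.out.println(top.getKey() + " => " + top.getValue());
    }
}
```

This makes one pass over the entries, so it is O(n) instead of the O(n log n) needed to sort them, but it gives you only the top word and not the full ranking.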


The most important part of this solution is sorting all entries. Since Map.Entry doesn't implement the Comparable interface, we need to write our own custom Comparator to sort the entries. If you look at my implementation, I am comparing entries on their values because that's what we want. Many programmers ask, why not use the LinkedHashMap class? But remember, LinkedHashMap only keeps entries in the order they were inserted, and even a TreeMap sorts by keys, not by values. So you need this special Comparator to compare values and store the sorted entries in a List.
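On Java 8 and later, the value comparison does not have to be written by hand. The following is a sketch under that assumption, with the hypothetical names SortByValueSketch and sortedByCountDescending. It also shows where LinkedHashMap can help: not for sorting, but for remembering the order once the entries are already sorted.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class SortByValueSketch {

    // Hypothetical helper: returns a map whose iteration order is highest count first.
    static Map<String, Integer> sortedByCountDescending(Map<String, Integer> wordMap) {
        List<Map.Entry<String, Integer>> entries = new ArrayList<>(wordMap.entrySet());
        // comparingByValue() compares entries by their Integer value; reversed() flips to descending
        entries.sort(Map.Entry.<String, Integer>comparingByValue().reversed());

        // LinkedHashMap does not sort anything itself; it only preserves insertion order
        Map<String, Integer> sorted = new LinkedHashMap<>();
        for (Map.Entry<String, Integer> entry : entries) {
            sorted.put(entry.getKey(), entry.getValue());
        }
        return sorted;
    }

    public static void main(String[] args) {
        Map<String, Integer> wordMap = new HashMap<>();
        wordMap.put("java", 3);
        wordMap.put("code", 1);
        wordMap.put("file", 2);
        System.out.println(sortedByCountDescending(wordMap));
    }
}
```

Map.Entry.comparingByValue() is equivalent to the anonymous Comparator approach, just shorter; which one to write in an interview depends on the Java version you are allowed to use.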

Here is one approach to solve this problem using the map-reduce technique:

[Figure: How to find the highest repeating word from a text file in Java]
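As a rough, single-machine analogue of the map-reduce idea, the Java 8 Stream API can express the same two phases: split the text into words (map), then group equal words and count them (reduce). This sketch is an assumption about what the technique looks like in plain Java and is not a distributed implementation; StreamWordCountSketch and countWords are hypothetical names.

```java
import java.util.Arrays;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

class StreamWordCountSketch {

    // Hypothetical helper: counts words with a group-and-count collector.
    static Map<String, Long> countWords(String text) {
        return Arrays.stream(text.toLowerCase().split("\\s+")) // "map": text -> words
                .filter(word -> !word.isEmpty())                 // drop empty tokens
                .collect(Collectors.groupingBy(                  // "reduce": group equal words
                        Function.identity(),
                        Collectors.counting()));                 // and count each group
    }

    public static void main(String[] args) {
        // The result is a HashMap, so the printed order is unspecified
        System.out.println(countWords("Java is great and java is fast"));
    }
}
```

Note that Collectors.counting() produces Long values, so the resulting map is a Map<String, Long> rather than the Map<String, Integer> used in the rest of this article.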



Java Program to Print words and their count from a File

import java.io.BufferedReader;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Map.Entry;
import java.util.Set;
import java.util.regex.Pattern;

/**
 * Java program to find count of repeated words in a file.
 *
 * @author
 */
public class Problem {

    public static void main(String args[]) {
        Map<String, Integer> wordMap = buildWordMap("C:/temp/words.txt");
        List<Entry<String, Integer>> list = sortByValueInDecreasingOrder(wordMap);
        System.out.println("List of repeated word from file and their count");
        for (Map.Entry<String, Integer> entry : list) {
            // print only the words that occur more than once
            if (entry.getValue() > 1) {
                System.out.println(entry.getKey() + " => " + entry.getValue());
            }
        }
    }

    public static Map<String, Integer> buildWordMap(String fileName) {
        // Using diamond operator for clean code
        Map<String, Integer> wordMap = new HashMap<>();
        // Using try-with-resource statement for automatic resource management
        try (FileInputStream fis = new FileInputStream(fileName);
                BufferedReader br = new BufferedReader(new InputStreamReader(fis))) {
            // words are separated by whitespace
            Pattern pattern = Pattern.compile("\\s+");
            String line = null;
            while ((line = br.readLine()) != null) {
                // do this if case sensitivity is not required i.e. Java = java
                line = line.toLowerCase();
                String[] words = pattern.split(line);
                for (String word : words) {
                    // blank lines and leading whitespace produce empty tokens, skip them
                    if (word.isEmpty()) {
                        continue;
                    }
                    if (wordMap.containsKey(word)) {
                        wordMap.put(word, (wordMap.get(word) + 1));
                    } else {
                        wordMap.put(word, 1);
                    }
                }
            }
        } catch (IOException ioex) {
            ioex.printStackTrace();
        }
        return wordMap;
    }

    public static List<Entry<String, Integer>> sortByValueInDecreasingOrder(Map<String, Integer> wordMap) {
        Set<Entry<String, Integer>> entries = wordMap.entrySet();
        List<Entry<String, Integer>> list = new ArrayList<>(entries);
        Collections.sort(list, new Comparator<Map.Entry<String, Integer>>() {
            @Override
            public int compare(Map.Entry<String, Integer> o1, Map.Entry<String, Integer> o2) {
                // o2 before o1 gives decreasing order of count
                return (o2.getValue()).compareTo(o1.getValue());
            }
        });
        return list;
    }
}

Output:
List of repeated word from file and their count
its => 2
of => 2
programming => 2
java => 2
language => 2


Things to note

If you are writing code in interviews, make sure it is production quality code, which means you must handle as many errors as possible, you must write unit tests, you must comment the code, and you must do proper resource management. Here are a couple more points to remember:

1) Close files and streams once you are through with them, see this tutorial to learn the right way to close a stream. If you are on Java 7 or later, just use the try-with-resources statement.

2) Since the size of the file is not specified, the interviewer may grill you on cases like: what happens if the file is large? With a very large file containing many distinct words, the word Map can outgrow the heap, and your program will run out of memory and throw java.lang.OutOfMemoryError: Java heap space. One solution for this is to do the task in chunks, e.g. first read 20% of the content and find the maximum repeated word in it, then read the next 20% of the content and find the maximum repeated word by taking the previous maximum into consideration. This way, you don't need to store all words in memory and you can process a file of arbitrary length.

3) Always use Generics for type-safety.
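Points 1 and 2 can be combined in a small sketch: read the file line by line inside a try-with-resources block, and carry a running maximum forward instead of sorting at the end. The names StreamingWordCountSketch and mostFrequentWord are hypothetical, and UTF-8 is an assumed encoding. Be aware that this still keeps one Map entry per distinct word, so it only avoids holding the file content and a sorted list in memory; it is not a full answer to the out-of-memory question.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashMap;
import java.util.Map;

class StreamingWordCountSketch {

    // Hypothetical helper: returns the most repeated word, or null for a file with no words.
    static String mostFrequentWord(Path file) throws IOException {
        Map<String, Integer> counts = new HashMap<>();
        String best = null;
        int bestCount = 0;
        // try-with-resources closes the reader even if an exception is thrown
        try (BufferedReader reader = Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
            String line;
            while ((line = reader.readLine()) != null) { // only one line is held at a time
                for (String word : line.toLowerCase().split("\\s+")) {
                    if (word.isEmpty()) {
                        continue;
                    }
                    int count = counts.merge(word, 1, Integer::sum);
                    // compare against the previous maximum instead of sorting later
                    if (count > bestCount) {
                        bestCount = count;
                        best = word;
                    }
                }
            }
        }
        return best;
    }
}
```

A caller would pass something like Paths.get("C:/temp/words.txt") and handle the IOException, which this sketch deliberately propagates instead of swallowing.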


That's all on how to find repeated words from a file and print their count. You can apply the same technique to find duplicate words in a String. Since you now have a sorted list of words and their count, you can also find the maximum, the minimum, or the repeated words which have occurred more than a specific number of times.

Further Reading
If you are preparing for a programming job interview then you must prepare for all the important topics, e.g. data structures, strings, arrays, etc. One book which can help you in this task is Cracking the Coding Interview. It contains 150 programming questions and solutions, which is good enough to clear most of the coding interviews.
