当前位置:首页 > 开发 > 编程语言 > Java > 正文

Trie tree(字典树)的Java实现及其应用-统计以某字符串为前缀的单词的数量

发表于: 2012-05-12   作者:bylijinnan   来源:转载   浏览:
摘要: import java.util.LinkedList; public class CaseInsensitiveTrie { /** 字典树的Java实现。实现了插入、查询以及深度优先遍历。 Trie tree's java implementation.(Insert,Search,DFS) Problem Description Igna
import java.util.LinkedList;

public class CaseInsensitiveTrie {

	/**
	字典树的Java实现。实现了插入、查询以及深度优先遍历。 
    Trie tree's java implementation.(Insert,Search,DFS)
    
	Problem Description
	Ignatius最近遇到一个难题,老师交给他很多单词(只有小写字母组成,不会有重复的单词出现),现在老师要他统计出以某个字符串为前缀的单词数量(单词本身也是自己的前缀).

	Input
	输入数据的第一部分是一张单词表,每行一个单词,单词的长度不超过10,它们代表的是老师交给Ignatius统计的单词,一个空行代表单词表的结束.第二部分是一连串的提问,每行一个提问,每个提问都是一个字符串.
	注意:本题只有一组测试数据,处理到文件结束.

	Output
	对于每个提问,给出以该字符串为前缀的单词的数量.

	Sample Input
	banana
	band
	bee
	absolute
	acm

	ba
	b
	band
	abc
	 
	Sample Output
	2
	3
	1
	0
	*/
	private int SIZE = 26;// 26 letters.CaseInsensitive.
	private TrieNode root;

	CaseInsensitiveTrie() {
		root = new TrieNode();
	}

	private class TrieNode {
		private int num;// how many words go through this node.
		private TrieNode[] son;// TrieNode[26] in this case
		private boolean isEnd;// end of a string.
		private char val;// No this field mostly.
		// But I think it's easy to traverse when having a specific value in
		// each node.
		private boolean visited;// add this field for DFS

		TrieNode() {
			num = 1;//num=0 is wrong
			son = new TrieNode[SIZE];
			isEnd = false;
			visited = false;
		}
	}

	public void insert(String str) {
		if (str == null || str.length() == 0) {
			return;
		}
		TrieNode node = root;
		char[] letters=str.toCharArray();
		for (int i = 0, len = str.length(); i < len; i++) {
			int pos = letters[i] - 'a';
			if (node.son[pos] == null) {
				node.son[pos] = new TrieNode();
				node.son[pos].val = letters[i];
			} else {
				node.son[pos].num++;//for 'countPrefix(String prefix)'
			}
			node = node.son[pos];
		}
		node.isEnd = true;
	}

	/**
	 * count how many words start with the specific prefix.
	 * since we count it in insertion,what we need to do is to return the 'num' of the last letter of prefix.
	 */
	public int countPrefix(String prefix){
		if(prefix==null||prefix.length()==0){
			return -1;
		}
		TrieNode node=root;
		char[] letters=prefix.toCharArray();
		for(int i=0,len=prefix.length();i<len;i++){
			int pos=letters[i]-'a';
			if(node.son[pos]==null){
				return 0;
			}else{
				node=node.son[pos];
			}
		}
		return node.num;
	}
	
	// search a word in the tree.Complete matching.
	public boolean has(String str) {
		if (str == null || str.length() == 0) {
			return false;
		}
		TrieNode node = root;
		char[] letters=str.toCharArray();
		for (int i = 0, len = str.length(); i < len; i++) {
			int pos = letters[i] - 'a';
			if (node.son[pos] != null) {
				node = node.son[pos];
			} else {
				return false;
			}
		}
		return node.isEnd;
	}

	// DFS.Use stack.Like a 26-nary tree.
	public void printAllWords() {
		TrieNode rootNode = root;
		if (root == null){
			return;
		}
		LinkedList<TrieNode> list = new LinkedList<TrieNode>();
		for (int i = 0; i < SIZE; i++) {
			TrieNode node = rootNode.son[i];
			if (node != null) {
				list.addLast(node);
				while (!list.isEmpty()) {
					TrieNode curNode = list.getLast();
					TrieNode firstChild = firstChildOf(curNode);
					while (firstChild != null) {
						list.addLast(firstChild);
						firstChild = firstChildOf(firstChild);// DFS.
					}
					TrieNode strEnd = list.getLast();
					if (strEnd.isEnd) {
						printLinkedList(list);
					}
					list.removeLast();
				}
			}
			list.clear();
		}
	}

	//print each node in preOrder.
	public void preTraverse(TrieNode node){
		if(node!=null){
			System.out.print(node.val+"-");
			for(TrieNode child:node.son){
				preTraverse(child);
			}
		}
		
	}
	// get the first unvisited child of a node.
	public TrieNode firstChildOf(TrieNode node) {
		if (node == null)
			return null;
		for (int i = 0; i < SIZE; i++) {
			TrieNode tmp = node.son[i];
			if (tmp != null && !tmp.visited) {
				tmp.visited = true;
				return tmp;
			}
		}
		return null;
	}

	public static void printLinkedList(LinkedList<TrieNode> list) {
		if (list == null || list.size() == 0){
			return;
		}
		StringBuilder sb = new StringBuilder();
		for (int i = 0, len = list.size(); i < len; i++) {
			sb.append(list.get(i).val);
		}
		System.out.println(sb.toString());
	}

	public TrieNode getRoot(){
		return this.root;
	}
	
	public static void main(String[] args) {
		CaseInsensitiveTrie tree = new CaseInsensitiveTrie();
		String[] strs={
				"banana",
				"band",
				"bee",
				"absolute",
				"acm",
		};
		String[] prefix={
			"ba",
			"b",
			"band",
			"abc",
		};
		for(String str:strs){
			tree.insert(str);
		}
		System.out.println(tree.has("abc"));
		tree.preTraverse(tree.getRoot());
		System.out.println();
		tree.printAllWords();
		for(String pre:prefix){
			int num=tree.countPrefix(pre);
			System.out.println(pre+" "+num);
		}
		
	}
}



另一道类似的题目:
Description
Given a list of phone numbers, determine if it is consistent in the sense that no number is the prefix of another. Let's say the phone catalogue listed these numbers:
Emergency 911
Alice 97 625 999
Bob 91 12 54 26
In this case, it's not possible to call Bob, because the central would direct your call to the emergency line as soon as you had dialled the first three digits of Bob's phone number. So this list would not be consistent.
Input
The first line of input gives a single integer, 1 ≤ t ≤ 40, the number of test cases. Each test case starts with n, the number of phone numbers, on a separate line, 1 ≤ n ≤ 10000. Then follows n lines with one unique phone number on each line. A phone number is a sequence of at most ten digits.
Output
For each test case, output "YES" if the list is consistent, or "NO" otherwise.
Sample Input
2
3
911
97625999
91125426
5
113
12340
123440
12345
98346
Sample Output
NO
YES

这道题用TrieTree也好做:在插入某个号码时候,如果遇到一个节点已经isEnd==true而号码还没有插入完成,则返回NO

Trie tree(字典树)的Java实现及其应用-统计以某字符串为前缀的单词的数量

  • 0

    开心

    开心

  • 0

    板砖

    板砖

  • 0

    感动

    感动

  • 0

    有用

    有用

  • 0

    疑问

    疑问

  • 0

    难过

    难过

  • 0

    无聊

    无聊

  • 0

    震惊

    震惊

编辑推荐
转载自: http://blog.csdn.net/hackbuteer1/article/details/7964147 一、知识简介 最近在看字符串
假设有b,abc,abd,bcd,abcd,efg,hii这6个单词,我们构建的树就是这样的。 对于每一个节点,从
今天AC了两题trie tree的题目,感觉trie的性质真的是相当的好,而且实现比较简单。它使在字符串集合
假设有b,abc,abd,bcd,abcd,efg,hii这6个单词,我们构建的树就是这样的。 对于每一个节点,从
字典树(trie tree) 今天AC了两题trie tree的题目,感觉trie的性质真的是相当的好,而且实现比较简单
trie或者prefix tree(前缀树),是一种有序树数据结构,它通常被存储在一个以字符串为关键字的联合数组
Trie树,又称为字典树,是一种树形结构,是一种哈希树的变种,是一种用于快速检索的多叉树数据结构
一、前缀树介绍 (注:本节内容来源于网络) 定义: 所有含有公共前缀的字符串将挂在树中同一个结点
一、前缀树介绍 (注:本节内容来源于网络) 定义: 所有含有公共前缀的字符串将挂在树中同一个结点
题意:实现trie树的3个功能,只含小写字母的串。 思路:老实做即可! 1 class TrieNode { 2 public:
版权所有 IT知识库 CopyRight © 2009-2015 IT知识库 IT610.com , All Rights Reserved. 京ICP备09083238号