How to find text in an XML file

In our new project we have to provide search functionality to retrieve data from hundreds of XML files. Below is a brief outline of our current plan; I would like your suggestions for improving it.

These XML files contain personal information, and the search is based on 10 elements in them, for example last name, first name, and email. Our current plan is to create a master XmlDocument holding all the searchable data plus a key to the actual file, so that when the user searches we first look in the master file to get the results. We will also cache the actual XML files from recent searches so that similar searches can be handled quickly later.

Our application is a .NET 2.0 web application.

asked Feb 19, 2009 at 4:59

gk.
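The master-index plan described in the question can also be held in memory rather than in an XmlDocument. What follows is only a minimal sketch under assumptions: the class name `MasterIndex`, the field names, and the file keys are all hypothetical, and it only supports exact (case-insensitive) matches. It sticks to C# 2.0 syntax to match the stated .NET 2.0 target.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical in-memory master index: field name -> value -> file keys.
public class MasterIndex
{
    private readonly Dictionary<string, Dictionary<string, List<string>>> index =
        new Dictionary<string, Dictionary<string, List<string>>>(StringComparer.OrdinalIgnoreCase);

    // Records that the file identified by fileKey has the given value for the given field.
    public void Add(string field, string value, string fileKey)
    {
        Dictionary<string, List<string>> byValue;
        if (!index.TryGetValue(field, out byValue))
        {
            byValue = new Dictionary<string, List<string>>(StringComparer.OrdinalIgnoreCase);
            index[field] = byValue;
        }

        List<string> files;
        if (!byValue.TryGetValue(value, out files))
        {
            files = new List<string>();
            byValue[value] = files;
        }

        if (!files.Contains(fileKey))
        {
            files.Add(fileKey);
        }
    }

    // Returns the keys of all files whose field equals value (case-insensitive),
    // or an empty list when nothing matches.
    public IList<string> Find(string field, string value)
    {
        Dictionary<string, List<string>> byValue;
        List<string> files;
        if (index.TryGetValue(field, out byValue) && byValue.TryGetValue(value, out files))
        {
            return files.AsReadOnly();
        }
        return new List<string>().AsReadOnly();
    }
}
```

A lookup such as `index.Find("LastName", "smith")` then returns the file keys to load (or to fetch from the cache of recently used files). Partial matches or ranking would need a different structure, which is where a database or a full-text index becomes attractive.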

First: how big are the xml files? XmlDocument doesn’t scale to “huge”… but can handle “large” OK.

Second: can you perhaps put the data into a regular database structure (perhaps SQL Server Express Edition), index it, and access via regular TSQL? That will usually out-perform an xpath search. Equally, if it is structured, SQL Server 2005 and above supports the xml data-type, which shreds data – this allows you to index and query xml data in the database without having the entire DOM in memory (it translates xpath into relational queries).

answered Feb 19, 2009 at 5:02

Marc Gravell

If you can store the data in a SQL Server database, then you could make use of SQL Server's built-in XPath query functionality.

answered Feb 19, 2009 at 5:02

Dave Barker

Hmm, sounds like you're building a database on top of XML. For performance I'd read those files into the DB of your choice and let it handle indexing and searching for you. If that's not an option, get really good with XPath, or roll your own exhaustive search using XmlReader.

XML is not the answer to every problem; however clean it appears to be, performance will suck.

answered Feb 19, 2009 at 5:03

MrTelly
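For reference, an exhaustive search with XmlReader might look roughly like the sketch below. It is not taken from any of the answers. It assumes that the searchable elements contain only text (no child elements), and the names `XmlFileSearch`, `ContainsValue`, and `FindFiles` are hypothetical. It streams each file instead of loading a DOM and uses only C# 2.0 features.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;

public static class XmlFileSearch
{
    // Returns true if any element called elementName has text equal to value
    // (case-insensitive). Assumes those elements hold text only.
    public static bool ContainsValue(TextReader xml, string elementName, string value)
    {
        using (XmlReader reader = XmlReader.Create(xml))
        {
            while (!reader.EOF)
            {
                if (reader.NodeType == XmlNodeType.Element && reader.LocalName == elementName)
                {
                    // Reads the text and moves past the end tag, so no extra Read() here.
                    string text = reader.ReadElementContentAsString();
                    if (string.Equals(text.Trim(), value, StringComparison.OrdinalIgnoreCase))
                    {
                        return true;
                    }
                }
                else
                {
                    reader.Read();
                }
            }
        }
        return false;
    }

    // Scans every *.xml file in a directory and returns the paths that match.
    public static List<string> FindFiles(string directory, string elementName, string value)
    {
        List<string> matches = new List<string>();
        foreach (string path in Directory.GetFiles(directory, "*.xml"))
        {
            using (StreamReader file = new StreamReader(path))
            {
                if (ContainsValue(file, elementName, value))
                {
                    matches.Add(path);
                }
            }
        }
        return matches;
    }
}
```

Every search reads every file, so the cost grows with the total amount of XML. That is acceptable for a few hundred small files, but it is the reason the other answers suggest an index or a database.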

Index your XML files. Look into http://incubator.apache.org/lucene.net/

I recently used it at my previous job to cache our SQL database for fast searching and very little overhead.

It provides fast searching of content inside xml files (all depending on how you organize your cache).

Very easy and straightforward to use.

Much easier than trying to loop through a bunch of files.

answered Feb 19, 2009 at 15:26

Gautam

Why don't you store the searchable data in a database table with a key to the actual file? Your search would then run against the database table rather than the XML files. I suppose this would be faster, because you can index the table.

answered Feb 19, 2009 at 5:04

Nahom Tijnam

I would like to know how to find a string in an XML file.

Say this is the XML file I have (these are SQL Server instances, by the way, but that is irrelevant):

<?xml version="1.0" encoding="utf-8" ?>
<Servernames>
    <loc country="Lockheed">
        <Servername>instance1server1</Servername>
        <Servername>instance2server2</Servername>
        <Servername>10.90</Servername>
    </loc>
    <loc country="SouthAmerica">
        <Servername>Hide your heart</Servername>
        <Servername>Bonnie Tyler</Servername>
        <Servername>10.0</Servername>
    </loc>
    <loc country="Britian">
        <Servername>GreatestHits</Servername>
        <Servername>DollyParton</Servername>
        <Servername>thisis</Servername>
    </loc>
</Servernames>

I get a string from the user in any format. For example, if I only get "instance", I want the list box to display all the server names that start with that string. In the case above that would be:

instance1server1
instance2server2

and so on.
I'm not sure how to achieve this. Do I have to open a StreamReader, or just take the string and browse through the XML file?

UPDATED

private void button1_Click(object sender, RoutedEventArgs e)
{
    textBox1.Clear();

    // Verbatim string (@"...") so the backslashes are not treated as escapes.
    string fileName = @"c:\users\xxxx\documents\visual studio 2010\Projects\WpfApplication2\WpfApplication2\XML.xml";

    var doc = XDocument.Load(fileName);
    var findString = "Server";

    var results = doc.Element("Servernames").Descendants("Servername").Where(d => d.Value.Contains(findString)).Select(d => d.Value);
    listBox1.Items.Add(results.ToString());
    textBox1.Text = results.ToString();
}

I am simply getting this in the text box: System.Linq.Enumerable+WhereSelectEnumerableIterator`2[System.Xml.Linq.XElement,System.String]
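The text box shows a type name because calling ToString() on a LINQ query returns the name of the iterator type, not the matched values; the query has to be enumerated. Note also that Contains("Server") is case-sensitive, while the sample values contain "server" in lower case. A minimal sketch of one way to handle both, assuming the same Servernames layout (the helper name `FindServers` is hypothetical):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class ServerSearch
{
    // Returns the text of every <Servername> element that contains findString,
    // ignoring case. ToList() runs the query and materializes the results.
    public static List<string> FindServers(XDocument doc, string findString)
    {
        return doc.Descendants("Servername")
                  .Select(e => e.Value)
                  .Where(v => v.IndexOf(findString, StringComparison.OrdinalIgnoreCase) >= 0)
                  .ToList();
    }
}

// Possible use inside the click handler (control names as in the question):
//
//   List<string> results = ServerSearch.FindServers(XDocument.Load(fileName), "server");
//   foreach (string name in results)
//   {
//       listBox1.Items.Add(name);          // one list box item per match
//   }
//   textBox1.Text = string.Join(Environment.NewLine, results.ToArray());
```

To match only names that start with the user's input, the IndexOf test could be replaced with `v.StartsWith(findString, StringComparison.OrdinalIgnoreCase)`.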

UPDATE 2

.cs file code

private void button1_Click(object sender, RoutedEventArgs e)
{
    textBox1.Clear();

    // Verbatim string (@"...") so the backslashes are not treated as escapes.
    string fileName = @"c:\users\xxxxx\documents\visual studio 2010\Projects\WpfApplication2\WpfApplication2\XML.xml";

    var doc = XDocument.Load(fileName);
    var findString = "Server";

    var results = doc.Element("Servernames").Descendants("Servername").Where(d => d.Value.Contains(findString)).Select(d => d.Value);

    Servers = new ObservableCollection<string>(results);

    MessageBox.Show("THis is loaded");
}

XAML looks like this

<ListBox   Height="200" HorizontalAlignment="Left" Margin="200,44,0,0" x:Name="ListBox1" VerticalAlignment="Top" Width="237">

12.09.2011, 11:23. Views: 6151. Replies: 2


Hello! Could you please tell me whether it is possible to search inside an XML file by words or phrases using third-party programs, similar to the way Windows search looks inside Word files?



13.09.2011, 15:56

Ctrl+F in practically any web browser.
Or did you mean something else?



13.09.2011, 17:29

[OP]

Something else. Anyway, I found out how to do this in WinXP, and in Win7 the built-in search handles it without any problems. There is also Notepad++, and you can search with Total Commander as well, but Notepad++ is free.





Of course, you can use Windows Search, as shown in the figure below. Once it is configured, you can use the Windows advanced search options in the File Explorer search box like this:

type:xml content:"search-words"

However, I noticed that it will not find the XML element itself; it only finds matches in element data and attribute values, which is somewhat strange. So the best option is Notepad++, since it works very well.

In addition, I later noticed that you can search by XML field name and value like this:

type:xml fieldname:"field-name" field:"field-value"

That is really nice but, to be honest, I do not get consistent results, so use it only as a rough guide.

[Figure: Configuring Windows Search and indexing]
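If neither Windows Search nor an editor is convenient, a plain-text scan similar to a "find in files" feature takes only a few lines of C#. This is only a sketch under assumptions: the names `XmlTextSearch` and `FindInFiles` are hypothetical, and each file is read fully into memory, so it suits small to medium files. Because it treats the XML as plain text, it also matches element and attribute names, which the Windows Search queries above reportedly miss.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class XmlTextSearch
{
    // Returns the paths of all *.xml files under directory (including
    // subdirectories) whose raw text contains the search text, ignoring case.
    public static List<string> FindInFiles(string directory, string text)
    {
        List<string> hits = new List<string>();
        foreach (string path in Directory.GetFiles(directory, "*.xml", SearchOption.AllDirectories))
        {
            // Plain-text comparison: tags, attributes, and values all count.
            string content = File.ReadAllText(path);
            if (content.IndexOf(text, StringComparison.OrdinalIgnoreCase) >= 0)
            {
                hits.Add(path);
            }
        }
        return hits;
    }
}
```

For example, `XmlTextSearch.FindInFiles(@"C:\data", "Servername")` would list every XML file under the (hypothetical) folder C:\data that mentions that element name or contains it as text.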
