How to Perform Full-Text Search in MongoDB

MongoDB, one of the leading NoSQL databases, is well known for its high performance, flexible schema, scalability, and strong indexing capabilities. Let's look at some context before going into the details. Full-text search is an essential feature for finding content on the Internet. Google search is the best-known example: we find content by typing phrases or keywords. In this article, we will learn about the full-text search capabilities of MongoDB, which are based on the text index.

Create a Sample Database

Before we begin, we will create a sample database to use throughout the tutorial.

We will create a database named myDB and a collection named books. The statements for this are as follows.

> use myDB
> db.createCollection("books")

Let's insert a few documents using the following statement.

> db.books.insertMany([
    {
      "title": "Eloquent JavaScript, Second Edition",
      "subtitle": "A Modern Introduction to Programming",
      "author": "Marijn Haverbeke",
      "publisher": "No Starch Press",
      "description": "JavaScript lies at the heart of almost every modern web application, from social apps to the newest browser-based games. Though simple for beginners to pick up and play with, JavaScript is a flexible, complex language that you can use to build full-scale applications."
    },
    {
      "title": "Learning JavaScript Design Patterns",
      "subtitle": "A JavaScript and jQuery Developer's Guide",
      "author": "Addy Osmani",
      "publisher": "O'Reilly Media",
      "description": "With Learning JavaScript Design Patterns, you'll learn how to write beautiful, structured, and maintainable JavaScript by applying classical and modern design patterns to the language. If you want to keep your code efficient, more manageable, and up-to-date with the latest best practices, this book is for you."
    },
    {
      "title": "Speaking JavaScript",
      "subtitle": "An In-Depth Guide for Programmers",
      "author": "Axel Rauschmayer",
      "publisher": "O'Reilly Media",
      "description": "Like it or not, JavaScript is everywhere these days, from browser to server to mobile and now you, too, need to learn the language or dive deeper than you have. This concise book guides you into and through JavaScript, written by a veteran programmer who once found himself in the same position."
    },
    {
      "title": "Programming JavaScript Applications",
      "subtitle": "Robust Web Architecture with Node, HTML5, and Modern JS Libraries",
      "author": "Eric Elliott",
      "publisher": "O'Reilly Media",
      "description": "Take advantage of JavaScript's power to build robust web-scale or enterprise applications that are easy to extend and maintain. By applying the design patterns outlined in this practical book, experienced JavaScript developers will learn how to write flexible and resilient code that's easier-yes, easier-to work with as your codebase grows."
    },
    {
      "title": "Understanding ECMAScript 6",
      "subtitle": "The Definitive Guide for JavaScript Developers",
      "author": "Nicholas C. Zakas",
      "publisher": "No Starch Press",
      "description": "ECMAScript 6 represents the biggest update to the core of JavaScript in the history of the language. In Understanding ECMAScript 6, expert developer Nicholas C. Zakas provides a complete guide to the object types, syntax, and other exciting changes that ECMAScript 6 brings to JavaScript."
    }
  ])

Creating a Text Index

To perform a text search, we need to create a text index on the fields we want to search. A text index can cover a single field or multiple fields. The following statement creates a text index on a single field, here the description field.

> db.books.createIndex({"description": "text"})


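For this tutorial, we will create a text index on the description and subtitle fields. In MongoDB, a collection can have only one text index, so if you have already created a single-field text index, drop it first (for example, with db.books.dropIndex("description_text")). We will therefore create a compound text index using the following statement.

> db.books.createIndex({"subtitle": "text", "description": "text"})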
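
The key document for a text index maps each indexed field to the string "text", so a compound text index is simply a key document with several such entries. As a minimal sketch, the helper below builds that document from a list of field names. The name textIndexSpec is hypothetical and not part of MongoDB, and the sketch assumes the fields are top-level string fields.

```javascript
// Hypothetical helper (not a MongoDB API): builds the key document
// for a text index covering the given fields.
function textIndexSpec(fields) {
  const spec = {};
  for (const field of fields) {
    spec[field] = "text"; // "text" marks the field as part of the text index
  }
  return spec;
}

const spec = textIndexSpec(["subtitle", "description"]);
console.log(JSON.stringify(spec));
// In the mongo shell, the result could be passed to createIndex:
// db.books.createIndex(spec)
```

Building the key document from a list can be convenient when the set of searchable fields comes from configuration rather than being written out by hand.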


Now we will search for documents that contain the keyword "ECMAScript" in the description or subtitle field. For this, we can use the statement below.

> db.books.find({$text: {$search: "ECMAScript"}})


To return only the subtitle and description fields, pass a projection as the second argument.

> db.books.find({$text: {$search: "ECMAScript"}}, {subtitle: 1, description: 1})
{
    "_id" : ObjectId("602b09cb3cb6144ada1c62fe"),
    "subtitle" : "The Definitive Guide for JavaScript Developers",
    "description" : "ECMAScript 6 represents the biggest update to the core of JavaScript in the history of the language. In Understanding ECMAScript 6, expert developer Nicholas C. Zakas provides a complete guide to the object types, syntax, and other exciting changes that ECMAScript 6 brings to JavaScript."
}


You can also search for several words at once using the text index. By default, text search performs an OR search on all the words in the search string. If you search for "modern design patterns", it returns documents that contain any of the keywords modern, design, or patterns.


> db.books.find({$text: {$search: "modern design patterns"}}, {subtitle: 1, description: 1})
{
    "_id" : ObjectId("602b098f3cb6144ada1c2ea1"),
    "subtitle" : "A JavaScript and jQuery Developer's Guide",
    "description" : "With Learning JavaScript Design Patterns, you'll learn how to write beautiful, structured, and maintainable JavaScript by applying classical and modern design patterns to the language. If you want to keep your code efficient, more manageable, and up-to-date with the latest best practices, this book is for you."
}
{
    "_id" : ObjectId("602b09b93cb6144ada1c4bca"),
    "subtitle" : "Robust Web Architecture with Node, HTML5, and Modern JS Libraries",
    "description" : "Take advantage of JavaScript's power to build robust web-scale or enterprise applications that are easy to extend and maintain. By applying the design patterns outlined in this practical book, experienced JavaScript developers will learn how to write flexible and resilient code that's easier-yes, easier-to work with as your codebase grows."
}
{
    "_id" : ObjectId("602b095c3cb6144ada1c1028"),
    "subtitle" : "A Modern Introduction to Programming",
    "description" : "JavaScript lies at the heart of almost every modern web application, from social apps to the newest browser-based games. Though simple for beginners to pick up and play with, JavaScript is a flexible, complex language that you can use to build full-scale applications."
}

If you want to search for an exact phrase, such as documents that contain the words "modern design patterns" together, enclose the phrase in escaped double quotes inside the search string.


> db.books.find({$text: {$search: "\"modern design patterns\""}}, {subtitle: 1, description: 1})
{
    "_id" : ObjectId("602b098f3cb6144ada1c2ea1"),
    "subtitle" : "A JavaScript and jQuery Developer's Guide",
    "description" : "With Learning JavaScript Design Patterns, you'll learn how to write beautiful, structured, and maintainable JavaScript by applying classical and modern design patterns to the language. If you want to keep your code efficient, more manageable, and up-to-date with the latest best practices, this book is for you."
}

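The difference between the default OR search and an exact phrase search can be illustrated without a database. The sketch below is a simplified model written for this article, not MongoDB's implementation: it only lowercases and splits text into words, with no stemming or stop-word handling, and all function names are hypothetical.

```javascript
// Simplified model of term matching: lowercase words only,
// with no stemming and no stop-word removal (unlike MongoDB).
function tokenize(text) {
  return text.toLowerCase().match(/[a-z0-9]+/g) || [];
}

// OR semantics: true if at least one search term occurs in the text.
function matchesAnyTerm(text, search) {
  const words = new Set(tokenize(text));
  return tokenize(search).some((term) => words.has(term));
}

// Phrase semantics: true only if the terms occur together, in order.
function matchesPhrase(text, phrase) {
  const haystack = ` ${tokenize(text).join(" ")} `;
  const needle = ` ${tokenize(phrase).join(" ")} `;
  return haystack.includes(needle);
}

const subtitle = "A Modern Introduction to Programming";
const sentence = "applying classical and modern design patterns to the language";

console.log(matchesAnyTerm(subtitle, "modern design patterns")); // true
console.log(matchesPhrase(subtitle, "modern design patterns"));  // false
console.log(matchesPhrase(sentence, "modern design patterns"));  // true
```

The first subtitle matches the OR search because it contains "Modern", but it does not match the phrase search, which is why the quoted query returns fewer documents.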

If you want to exclude documents that contain a particular word, you can use a negation search by prefixing the word with a hyphen. For example, to find all documents that contain "JavaScript" but not "HTML5" or "ECMAScript", you can search as in the example below.


> db.books.find({$text: {$search: "JavaScript -HTML5 -ECMAScript"}}, {subtitle: 1, description: 1})
{
    "_id" : ObjectId("602b098f3cb6144ada1c2ea1"),
    "subtitle" : "A JavaScript and jQuery Developer's Guide",
    "description" : "With Learning JavaScript Design Patterns, you'll learn how to write beautiful, structured, and maintainable JavaScript by applying classical and modern design patterns to the language. If you want to keep your code efficient, more manageable, and up-to-date with the latest best practices, this book is for you."
}
{
    "_id" : ObjectId("602b09a83cb6144ada1c4973"),
    "subtitle" : "An In-Depth Guide for Programmers",
    "description" : "Like it or not, JavaScript is everywhere these days, from browser to server to mobile and now you, too, need to learn the language or dive deeper than you have. This concise book guides you into and through JavaScript, written by a veteran programmer who once found himself in the same position."
}
{
    "_id" : ObjectId("602b095c3cb6144ada1c1028"),
    "subtitle" : "A Modern Introduction to Programming",
    "description" : "JavaScript lies at the heart of almost every modern web application, from social apps to the newest browser-based games. Though simple for beginners to pick up and play with, JavaScript is a flexible, complex language that you can use to build full-scale applications."
}

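Plain terms, quoted phrases, and negated terms all end up in a single $search string. As a rough sketch, the helper below assembles that string from separate lists. The name buildTextFilter and its parameter names are hypothetical, and the sketch assumes the inputs contain no double quotes of their own.

```javascript
// Hypothetical helper (not a MongoDB API): assembles a $search string
// from plain terms, exact phrases, and terms to exclude.
function buildTextFilter({ terms = [], phrases = [], exclude = [] } = {}) {
  const parts = [
    ...terms,
    ...phrases.map((phrase) => `"${phrase}"`), // quoted: match the exact phrase
    ...exclude.map((term) => `-${term}`),      // leading hyphen: negate the term
  ];
  return { $text: { $search: parts.join(" ") } };
}

const filter = buildTextFilter({
  terms: ["JavaScript"],
  exclude: ["HTML5", "ECMAScript"],
});
console.log(filter.$text.$search); // JavaScript -HTML5 -ECMAScript
// In the mongo shell: db.books.find(filter, { subtitle: 1, description: 1 })
```

Keeping the three kinds of input separate avoids hand-escaping quotes and hyphens each time a query is written.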

Text Search Score

Text search assigns a score to each document that represents how relevant the document is to the search query. This score can be used to sort the records returned in the search results. A higher score indicates a more relevant match.


> db.books.find({$text: {$search: "JavaScript"}}, {score: {$meta: "textScore"}, subtitle: 1, description: 1}).sort({score: {$meta: "textScore"}})
{
    "_id" : ObjectId("602b098f3cb6144ada1c2ea1"),
    "subtitle" : "A JavaScript and jQuery Developer's Guide",
    "description" : "With Learning JavaScript Design Patterns, you'll learn how to write beautiful, structured, and maintainable JavaScript by applying classical and modern design patterns to the language. If you want to keep your code efficient, more manageable, and up-to-date with the latest best practices, this book is for you.",
    "score" : 1.43269230769231
}
{
    "_id" : ObjectId("602b09cb3cb6144ada1c62fe"),
    "subtitle" : "The Definitive Guide for JavaScript Developers",
    "description" : "ECMAScript 6 represents the biggest update to the core of JavaScript in the history of the language. In Understanding ECMAScript 6, expert developer Nicholas C. Zakas provides a complete guide to the object types, syntax, and other exciting changes that ECMAScript 6 brings to JavaScript.",
    "score" : 1.42672413793103
}
{
    "_id" : ObjectId("602b09a83cb6144ada1c4973"),
    "subtitle" : "An In-Depth Guide for Programmers",
    "description" : "Like it or not, JavaScript is everywhere these days, from browser to server to mobile and now you, too, need to learn the language or dive deeper than you have. This concise book guides you into and through JavaScript, written by a veteran programmer who once found himself in the same position.",
    "score" : 0.818181818181818
}
{
    "_id" : ObjectId("602b095c3cb6144ada1c1028"),
    "subtitle" : "A Modern Introduction to Programming",
    "description" : "JavaScript lies at the heart of almost every modern web application, from social apps to the newest browser-based games. Though simple for beginners to pick up and play with, JavaScript is a flexible, complex language that you can use to build full-scale applications.",
    "score" : 0.801724137931034
}
{
    "_id" : ObjectId("602b09b93cb6144ada1c4bca"),
    "subtitle" : "Robust Web Architecture with Node, HTML5, and Modern JS Libraries",
    "description" : "Take advantage of JavaScript's power to build robust web-scale or enterprise applications that are easy to extend and maintain. By applying the design patterns outlined in this practical book, experienced JavaScript developers will learn how to write flexible and resilient code that's easier-yes, easier-to work with as your codebase grows.",
    "score" : 0.792857142857143
}

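A scored query repeats the {$meta: "textScore"} expression in both the projection and the sort. As a small sketch, the helper below builds the three documents in one place so they stay consistent. The name scoredTextQuery and the output field name score are illustrative choices, not MongoDB requirements.

```javascript
// Hypothetical helper (not a MongoDB API): builds the filter, projection,
// and sort documents for a text search ordered by relevance.
function scoredTextQuery(search, fields) {
  const projection = { score: { $meta: "textScore" } };
  for (const field of fields) {
    projection[field] = 1; // include the field in the results
  }
  return {
    filter: { $text: { $search: search } },
    projection,
    sort: { score: { $meta: "textScore" } }, // highest score first
  };
}

const query = scoredTextQuery("JavaScript", ["subtitle", "description"]);
console.log(JSON.stringify(query.sort));
// In the mongo shell:
// db.books.find(query.filter, query.projection).sort(query.sort)
```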

Stop Words

The $text operator filters out language-specific stop words, such as a, an, and the in English. The search below returns no documents because "is" is a stop word.


> db.books.find({$text: {$search: "is"}}, {subtitle: 1, description: 1})
	Fetched 0 record(s)
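
Since a search string made up only of stop words matches nothing, an application may want to detect that case before querying. The sketch below is only an illustration: the word list is a small sample taken from the examples in this article, not MongoDB's actual English stop-word list, and the function name is hypothetical.

```javascript
// Illustrative stop-word sample taken from the examples in this article;
// MongoDB's actual English stop-word list is longer.
const SAMPLE_STOP_WORDS = new Set(["a", "an", "the", "is"]);

// Returns true if the search string contains at least one word
// that is not in the sample stop-word list.
function hasSearchableTerm(search) {
  return search
    .toLowerCase()
    .split(/\s+/)
    .filter(Boolean)
    .some((word) => !SAMPLE_STOP_WORDS.has(word));
}

console.log(hasSearchableTerm("is"));            // false
console.log(hasSearchableTerm("is JavaScript")); // true
```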

Stemmed Words

The $text operator stems words and matches on the complete stemmed word. So if a document field contains the word "learn" or "learning", a search for either "learn" or "learning" returns the same results.


> db.books.find({$text: {$search: "learn"}}, {subtitle: 1, description: 1})
or
> db.books.find({$text: {$search: "learning"}}, {subtitle: 1, description: 1})
{
    "_id" : ObjectId("602b098f3cb6144ada1c2ea1"),
    "subtitle" : "A JavaScript and jQuery Developer's Guide",
    "description" : "With Learning JavaScript Design Patterns, you'll learn how to write beautiful, structured, and maintainable JavaScript by applying classical and modern design patterns to the language. If you want to keep your code efficient, more manageable, and up-to-date with the latest best practices, this book is for you."
}
{
    "_id" : ObjectId("602b09a83cb6144ada1c4973"),
    "subtitle" : "An In-Depth Guide for Programmers",
    "description" : "Like it or not, JavaScript is everywhere these days, from browser to server to mobile and now you, too, need to learn the language or dive deeper than you have. This concise book guides you into and through JavaScript, written by a veteran programmer who once found himself in the same position."
}
{
    "_id" : ObjectId("602b09b93cb6144ada1c4bca"),
    "subtitle" : "Robust Web Architecture with Node, HTML5, and Modern JS Libraries",
    "description" : "Take advantage of JavaScript's power to build robust web-scale or enterprise applications that are easy to extend and maintain. By applying the design patterns outlined in this practical book, experienced JavaScript developers will learn how to write flexible and resilient code that's easier-yes, easier-to work with as your codebase grows."
}

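Stemming reduces inflected forms of a word to a common root, which is why "learn" and "learning" end up matching the same documents. The sketch below shows the idea with a deliberately naive suffix-stripping function. It is not the stemming algorithm MongoDB uses, and the name naiveStem is hypothetical.

```javascript
// Deliberately naive suffix stripping for illustration only;
// this is not the stemming algorithm MongoDB uses.
function naiveStem(word) {
  const lower = word.toLowerCase();
  const stripped = lower.replace(/(ing|ed|s)$/, "");
  // Keep very short results intact to avoid mangling words like "is".
  return stripped.length >= 3 ? stripped : lower;
}

console.log(naiveStem("learn"));    // learn
console.log(naiveStem("learning")); // learn
console.log(naiveStem("Learns"));   // learn
```

Because both the indexed text and the search terms are reduced in the same way, different surface forms of a word compare equal at query time.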

I hope you learned something new today. You may also be interested in an article on self-hosted MongoDB. I encourage you to try these queries yourself and share your experience in the comments section. If you run into any problems with any of the statements above, feel free to ask me in the comments section below.

