{"id":45514,"date":"2024-04-08T07:37:19","date_gmt":"2024-04-08T07:37:19","guid":{"rendered":"https:\/\/www.devopsschool.com\/blog\/?p=45514"},"modified":"2024-04-08T07:44:24","modified_gmt":"2024-04-08T07:44:24","slug":"apache-lucene-query-example","status":"publish","type":"post","link":"https:\/\/www.devopsschool.com\/blog\/apache-lucene-query-example\/","title":{"rendered":"Apache Lucene Query Example"},"content":{"rendered":"\n<h2 class=\"wp-block-heading\">Lucene query cheatsheet<\/h2>\n\n\n\n<h3 class=\"wp-block-heading\">Basic Search<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Single Term:<\/strong><code>term<\/code>\n<ul class=\"wp-block-list\">\n<li>Finds documents containing <code>term<\/code>.<\/li>\n<\/ul>\n<\/li>\n\n\n\n<li><strong>Phrase Search:<\/strong><code>\"exact phrase\"<\/code>\n<ul class=\"wp-block-list\">\n<li>Finds documents containing the exact phrase.<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Boolean Operators<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>AND:<\/strong><code>term1 AND term2<\/code>\n<ul class=\"wp-block-list\">\n<li>Both terms must be present.<\/li>\n<\/ul>\n<\/li>\n\n\n\n<li><strong>OR:<\/strong><code>term1 OR term2<\/code>\n<ul class=\"wp-block-list\">\n<li>At least one of the terms must be present.<\/li>\n<\/ul>\n<\/li>\n\n\n\n<li><strong>NOT:<\/strong><code>NOT term<\/code>\n<ul class=\"wp-block-list\">\n<li>Documents must not contain <code>term<\/code>.<\/li>\n<\/ul>\n<\/li>\n\n\n\n<li><strong>Combination:<\/strong><code>(term1 AND term2) OR term3<\/code>\n<ul class=\"wp-block-list\">\n<li>Parentheses group clauses so operators can be combined: this matches documents containing both <code>term1<\/code> and <code>term2<\/code>, or containing <code>term3<\/code>.<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Wildcard Searches<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Single Character Wildcard:<\/strong><code>te?t<\/code>\n<ul class=\"wp-block-list\">\n<li>Matches any single character in that position, for example <code>test<\/code> or <code>text<\/code>.<\/li>\n<\/ul>\n<\/li>\n\n\n\n<li><strong>Multiple 
Character Wildcard:<\/strong><code>test*<\/code>\n<ul class=\"wp-block-list\">\n<li>Matches terms that start with <code>test<\/code> followed by zero or more characters, for example <code>test<\/code>, <code>tests<\/code>, or <code>testing<\/code>.<\/li>\n<\/ul>\n<\/li>\n\n\n\n<li><strong>Wildcard at Start:<\/strong><code>*test<\/code>\n<ul class=\"wp-block-list\">\n<li>Rejected by Lucene's classic query parser unless leading wildcards are explicitly enabled; Elasticsearch's <code>query_string<\/code> accepts them by default. Either way, leading wildcards can be slow on large indexes.<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Fuzzy Searches<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Fuzzy:<\/strong><code>term~<\/code>\n<ul class=\"wp-block-list\">\n<li>Matches terms within a small edit distance of the specified term (at most 2 edits); <code>term~1<\/code> limits the match to one edit.<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Proximity Searches<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Proximity:<\/strong><code>\"term1 term2\"~N<\/code>\n<ul class=\"wp-block-list\">\n<li>Matches documents in which the terms appear within N positions of each other.<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Range Searches<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Range:<\/strong><code>[start TO end]<\/code>\n<ul class=\"wp-block-list\">\n<li>Finds documents with terms within the specified range, including the start and end values.<\/li>\n<\/ul>\n<\/li>\n\n\n\n<li><strong>Exclusive Range:<\/strong><code>{start TO end}<\/code>\n<ul class=\"wp-block-list\">\n<li>Finds documents with terms strictly between the start and end values; the endpoints themselves are excluded.<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Regular Expressions<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Regex:<\/strong><code>\/regex\/<\/code>\n<ul class=\"wp-block-list\">\n<li>Matches terms against a regular expression; the pattern must match the whole term.<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Boosting Terms<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Boost:<\/strong><code>term^N<\/code>\n<ul class=\"wp-block-list\">\n<li>Multiplies the term's contribution to the relevance score by N; values below 1 lower it.<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Field-Specific Searches<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Specific 
Field:<\/strong><code>fieldname:term<\/code>\n<ul class=\"wp-block-list\">\n<li>Searches for the term within a specific field.<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Grouping<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Group Queries:<\/strong><code>(query1) AND (query2)<\/code>\n<ul class=\"wp-block-list\">\n<li>Parentheses group sub-queries so each is evaluated as a unit before the operators between them are applied.<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n\n\n\n<h2 class=\"wp-block-heading\">How to search Apache HTTPD logs using Lucene<\/h2>\n\n\n\n<p>These examples assume that the logs have been indexed in a Lucene-based system such as Elasticsearch. They show how Lucene query features can be used to filter and search log data. The field names used here (<code>ip<\/code>, <code>timestamp<\/code>, <code>response<\/code>, <code>request<\/code>, etc.) are illustrative; replace them with the fields defined in your own index for Apache HTTPD logs.<\/p>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n<pre class=\"wp-block-code\" aria-describedby=\"shcb-language-1\" data-shcb-language-name=\"JavaScript\" data-shcb-language-slug=\"javascript\"><span><code class=\"hljs language-javascript\"><span class=\"hljs-comment\">\/\/ 1. Find logs for a specific IP address<\/span>\n<span class=\"hljs-attr\">ip<\/span>:<span class=\"hljs-string\">\"192.168.1.1\"<\/span>\n\n<span class=\"hljs-comment\">\/\/ 2. Search logs within a specific date range<\/span>\n<span class=\"hljs-attr\">timestamp<\/span>:&#91;<span class=\"hljs-number\">20230101<\/span> TO <span class=\"hljs-number\">20230131<\/span>]\n\n<span class=\"hljs-comment\">\/\/ 3. Identify logs with 4xx client error response codes<\/span>\n<span class=\"hljs-attr\">response<\/span>:&#91;<span class=\"hljs-number\">400<\/span> TO <span class=\"hljs-number\">499<\/span>]\n\n<span class=\"hljs-comment\">\/\/ 4. 
Locate logs for requests to a specific URL<\/span>\n<span class=\"hljs-attr\">request<\/span>:<span class=\"hljs-string\">\"GET \/index.html HTTP\/1.1\"<\/span>\n\n<span class=\"hljs-comment\">\/\/ 5. Filter logs by a specific user-agent string<\/span>\n<span class=\"hljs-attr\">agent<\/span>:<span class=\"hljs-string\">\"Mozilla\/5.0 (Windows NT 10.0; Win64; x64)\"<\/span>\n\n<span class=\"hljs-comment\">\/\/ 6. Search for logs with a specific referrer<\/span>\n<span class=\"hljs-attr\">referrer<\/span>:<span class=\"hljs-string\">\"http:\/\/example.com\/\"<\/span>\n\n<span class=\"hljs-comment\">\/\/ 7. Find all logs of GET requests<\/span>\n<span class=\"hljs-attr\">request_method<\/span>:GET\n\n<span class=\"hljs-comment\">\/\/ 8. Filter logs resulting in 5xx server errors<\/span>\n<span class=\"hljs-attr\">response<\/span>:&#91;<span class=\"hljs-number\">500<\/span> TO <span class=\"hljs-number\">599<\/span>]\n\n<span class=\"hljs-comment\">\/\/ 9. Identify requests to a specific directory<\/span>\n<span class=\"hljs-attr\">request<\/span>:<span class=\"hljs-string\">\"\/images\/*\"<\/span>\n\n<span class=\"hljs-comment\">\/\/ 10. Locate requests taking longer than 2 seconds<\/span>\n<span class=\"hljs-attr\">duration<\/span>:&gt;<span class=\"hljs-number\">2000<\/span>\n\n<span class=\"hljs-comment\">\/\/ 11. Exclude logs from a specific IP address<\/span>\n-ip:<span class=\"hljs-string\">\"192.168.1.1\"<\/span>\n\n<span class=\"hljs-comment\">\/\/ 12. Find requests for a specific file type (.jpg)<\/span>\n<span class=\"hljs-attr\">request<\/span>:<span class=\"hljs-string\">\"*.jpg\"<\/span>\n\n<span class=\"hljs-comment\">\/\/ 13. Identify logs from a specific day<\/span>\n<span class=\"hljs-attr\">timestamp<\/span>:<span class=\"hljs-number\">20230115<\/span>\n\n<span class=\"hljs-comment\">\/\/ 14. 
Search logs with responses in a byte range<\/span>\n<span class=\"hljs-attr\">bytes<\/span>:&#91;<span class=\"hljs-number\">1000<\/span> TO <span class=\"hljs-number\">5000<\/span>]\n\n<span class=\"hljs-comment\">\/\/ 15. Filter logs by HTTP method and response code<\/span>\n<span class=\"hljs-attr\">request_method<\/span>:POST AND response:<span class=\"hljs-number\">200<\/span>\n\n<span class=\"hljs-comment\">\/\/ 16. Search for failed login attempts (custom log message)<\/span>\n<span class=\"hljs-attr\">message<\/span>:<span class=\"hljs-string\">\"Failed login attempt\"<\/span>\n\n<span class=\"hljs-comment\">\/\/ 17. Find logs from a range of IP addresses<\/span>\n<span class=\"hljs-attr\">ip<\/span>:&#91;<span class=\"hljs-number\">192.168<\/span><span class=\"hljs-number\">.1<\/span><span class=\"hljs-number\">.1<\/span> TO <span class=\"hljs-number\">192.168<\/span><span class=\"hljs-number\">.1<\/span><span class=\"hljs-number\">.100<\/span>]\n\n<span class=\"hljs-comment\">\/\/ 18. Identify logs with a 200 OK response<\/span>\n<span class=\"hljs-attr\">response<\/span>:<span class=\"hljs-number\">200<\/span>\n\n<span class=\"hljs-comment\">\/\/ 19. Search for logs with specific query parameters<\/span>\n<span class=\"hljs-attr\">request<\/span>:<span class=\"hljs-string\">\"*?user=john&amp;*\"<\/span>\n\n<span class=\"hljs-comment\">\/\/ 20. 
Locate logs with a 404 Not Found response<\/span>\n<span class=\"hljs-attr\">response<\/span>:<span class=\"hljs-number\">404<\/span>\n<\/code><\/span><small class=\"shcb-language\" id=\"shcb-language-1\"><span class=\"shcb-language__label\">Code language:<\/span> <span class=\"shcb-language__name\">JavaScript<\/span> <span class=\"shcb-language__paren\">(<\/span><span class=\"shcb-language__slug\">javascript<\/span><span class=\"shcb-language__paren\">)<\/span><\/small><\/pre>\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<script src=\"https:\/\/gist.github.com\/devops-school\/dd736c2642aa801eaf80458bd06b26e5.js\"><\/script>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"526\" src=\"https:\/\/www.devopsschool.com\/blog\/wp-content\/uploads\/2024\/04\/image-77-1024x526.png\" alt=\"\" class=\"wp-image-45518\" srcset=\"https:\/\/www.devopsschool.com\/blog\/wp-content\/uploads\/2024\/04\/image-77-1024x526.png 1024w, https:\/\/www.devopsschool.com\/blog\/wp-content\/uploads\/2024\/04\/image-77-300x154.png 300w, https:\/\/www.devopsschool.com\/blog\/wp-content\/uploads\/2024\/04\/image-77-768x395.png 768w, https:\/\/www.devopsschool.com\/blog\/wp-content\/uploads\/2024\/04\/image-77.png 1346w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"572\" src=\"https:\/\/www.devopsschool.com\/blog\/wp-content\/uploads\/2024\/04\/image-78-1024x572.png\" alt=\"\" class=\"wp-image-45519\" srcset=\"https:\/\/www.devopsschool.com\/blog\/wp-content\/uploads\/2024\/04\/image-78-1024x572.png 1024w, https:\/\/www.devopsschool.com\/blog\/wp-content\/uploads\/2024\/04\/image-78-300x168.png 300w, https:\/\/www.devopsschool.com\/blog\/wp-content\/uploads\/2024\/04\/image-78-768x429.png 768w, 
https:\/\/www.devopsschool.com\/blog\/wp-content\/uploads\/2024\/04\/image-78-740x414.png 740w, https:\/\/www.devopsschool.com\/blog\/wp-content\/uploads\/2024\/04\/image-78-355x199.png 355w, https:\/\/www.devopsschool.com\/blog\/wp-content\/uploads\/2024\/04\/image-78.png 1337w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n","protected":false},"excerpt":{"rendered":"<p>Lucene query cheatsheet Basic Search Boolean Operators Wildcard Searches Fuzzy Searches Proximity Searches Range Searches Regular Expressions Boosting Terms Field-Specific Searches Grouping How to search Apache HTTPD using Lucene These&#8230; <\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"","sticky":false,"template":"","format":"standard","meta":{"_joinchat":[],"footnotes":""},"categories":[2],"tags":[],"class_list":["post-45514","post","type-post","status-publish","format-standard","hentry","category-uncategorised"],"_links":{"self":[{"href":"https:\/\/www.devopsschool.com\/blog\/wp-json\/wp\/v2\/posts\/45514","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.devopsschool.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.devopsschool.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.devopsschool.com\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.devopsschool.com\/blog\/wp-json\/wp\/v2\/comments?post=45514"}],"version-history":[{"count":4,"href":"https:\/\/www.devopsschool.com\/blog\/wp-json\/wp\/v2\/posts\/45514\/revisions"}],"predecessor-version":[{"id":45520,"href":"https:\/\/www.devopsschool.com\/blog\/wp-json\/wp\/v2\/posts\/45514\/revisions\/45520"}],"wp:attachment":[{"href":"https:\/\/www.devopsschool.com\/blog\/wp-json\/wp\/v2\/media?parent=45514"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.devopsschool.com\/blog\/wp-json\/wp\/v2\/categories?post=45514"},{"taxonomy":"p
ost_tag","embeddable":true,"href":"https:\/\/www.devopsschool.com\/blog\/wp-json\/wp\/v2\/tags?post=45514"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}