When you work with an SQL database, you may need to find records that contain certain strings. In this article, we'll look at how to search for strings and substrings in MySQL and SQL Server.
Contents
- Using the WHERE clause and LIKE operator to search for a substring
- Searching for a substring in SQL Server with the CHARINDEX function
- Searching for a substring in SQL Server with the PATINDEX function
- A MySQL query for finding a substring with the SUBSTRING_INDEX() function
I'll be using a table called products_data in the products_schema database. Running SELECT * FROM products_data shows me all the records in the table:
Since I'll also be showing substring search in SQL Server, I have a products_data table in a products database there as well:
Searching for a substring with the WHERE clause and LIKE operator
The WHERE clause returns only the records that satisfy a given condition, and the LIKE operator matches a pattern against a column. Combining the two lets you search for a string or a substring.
For example, by combining WHERE with LIKE, I was able to get all the products that contain the word “computer”:
SELECT * FROM products_data WHERE product_name LIKE '%computer%'
The percent signs on both sides of “computer” each match any sequence of characters (including none), so the word is found whether it sits at the start, in the middle, or at the end of the string.
If you put the percent sign only before the substring, the pattern matches strings that end with it. For example, this query returned every product whose name ends in “er”:
SELECT * FROM products_data WHERE product_name LIKE '%er'
And if you put the percent sign only after the substring, the pattern matches strings that start with it. For example, I got the product whose name starts with “lap” by running this query:
SELECT * FROM products_data WHERE product_name LIKE 'lap%'
This method works just as well in SQL Server:
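To make the three wildcard placements concrete, here is a minimal sketch that should run in both MySQL and SQL Server. The table demo_products and its rows are made up for illustration (they are not the article's products_data table), and the values are all lowercase so the results do not depend on whether the column collation is case-sensitive.

```sql
-- Hypothetical demo table, not the article's products_data
CREATE TABLE demo_products (product_name VARCHAR(50));

INSERT INTO demo_products (product_name)
VALUES ('laptop'), ('desktop computer'), ('computer mouse'), ('scanner'), ('printer');

-- Substring anywhere: desktop computer, computer mouse
SELECT product_name FROM demo_products WHERE product_name LIKE '%computer%';

-- Substring at the end: desktop computer, scanner, printer
SELECT product_name FROM demo_products WHERE product_name LIKE '%er';

-- Substring at the start: laptop
SELECT product_name FROM demo_products WHERE product_name LIKE 'lap%';

DROP TABLE demo_products;
```

Without an ORDER BY, the rows may come back in any order.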
Searching for a substring in SQL Server with the CHARINDEX function
CHARINDEX() is a SQL Server function that returns the position of a substring within a string.
CHARINDEX() takes up to three arguments: the substring, the string, and an optional starting position for the search. The syntax looks like this:
CHARINDEX(substring, string, start_position)
If the function finds a match, it returns the position where the match starts; if there is no match, it returns 0. Unlike in many programming languages, string positions in SQL are counted from 1.
Example:
SELECT CHARINDEX('free', 'free is the watchword of freeCodeCamp') position;
As you can see, the word “free” was found at position 1, because its first letter, “f”, is at position 1:
You can also start the search from a specific position. For example, if you pass 25 as the starting position, SQL Server skips the first “free” and finds the match inside “freeCodeCamp”:
SELECT CHARINDEX('free', 'free is the watchword of freeCodeCamp', 25);
With CHARINDEX you can find all the products that contain the word “computer” by running this query:
SELECT * FROM products_data WHERE CHARINDEX('computer', product_name, 0) > 0
The query says: search each product_name for “computer” from the beginning of the string (a starting position of 0 is treated the same as 1) and keep the rows where the returned position is greater than 0, that is, where a match was found. Here is the result:
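The return values described in this section can be checked side by side with a small T-SQL sketch. It reuses the article's sample sentence; the column aliases and the 'paid' search term are only illustrative.

```sql
-- SQL Server: positions are 1-based, 0 means "not found"
SELECT
    CHARINDEX('free', 'free is the watchword of freeCodeCamp')     AS first_match, -- 1
    CHARINDEX('free', 'free is the watchword of freeCodeCamp', 25) AS from_25,     -- 26
    CHARINDEX('paid', 'free is the watchword of freeCodeCamp')     AS no_match;    -- 0
```

With a starting position of 25, the search begins at the space before “freeCodeCamp”, so the match is reported at position 26, where the second “free” begins.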
Searching for a substring in SQL Server with the PATINDEX function
PATINDEX stands for “pattern index”. This function lets you search for a substring using a wildcard pattern of the same kind the LIKE operator accepts (%, _, [] and [^]); these are not full regular expressions.
PATINDEX takes two arguments: the pattern and the string. The syntax looks like this:
PATINDEX(pattern, string)
If PATINDEX finds a match, it returns the position of that match. If no match is found, it returns 0. Here's an example:
SELECT PATINDEX('%ava%', 'JavaScript is a Jack of all trades');
To apply PATINDEX to the table, I ran the following query:
SELECT product_name, PATINDEX('%ann%', product_name) position FROM products_data
But it listed every product, along with the position at which the match was found (0 where there was none):
As you can see, the substring “ann” was found at position 3 in the product Scanner. Most likely, though, you'll want to see only the products that actually match the pattern.
To get that behavior, you can add the WHERE clause and LIKE operator:
SELECT product_name, PATINDEX('%ann%', product_name) position FROM products_data WHERE product_name LIKE '%ann%'
Now the query returns just what we need.
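As a rough illustration of what PATINDEX patterns can express beyond a plain substring, the T-SQL sketch below uses a character class and a single-character wildcard. The 'Order 66 shipped' string and the column aliases are invented for the example.

```sql
-- SQL Server: PATINDEX with LIKE-style wildcards
SELECT
    PATINDEX('%ava%',   'JavaScript is a Jack of all trades') AS plain_substring, -- 2
    PATINDEX('%[0-9]%', 'Order 66 shipped')                   AS first_digit,     -- 7
    PATINDEX('J_va%',   'JavaScript is a Jack of all trades') AS anchored_start,  -- 1
    PATINDEX('%xyz%',   'JavaScript is a Jack of all trades') AS no_match;        -- 0
```

A pattern without a leading % has to match from the first character of the string, which is why the third column can only ever be 1 or 0.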
A MySQL query for finding a substring with the SUBSTRING_INDEX() function
Besides the solutions I've already shown, MySQL has a built-in SUBSTRING_INDEX() function that returns part of a string.
SUBSTRING_INDEX() takes three required arguments: the string, a delimiter, and a number. The number is a count of delimiter occurrences.
Given these arguments, SUBSTRING_INDEX() returns the part of the string before the n-th occurrence of the delimiter, where n is the count you passed. Here's an example:
SELECT SUBSTRING_INDEX("Learn on freeCodeCamp with me", "with", 1);
In this query, “Learn on freeCodeCamp with me” is the string, “with” is the delimiter, and 1 is the number of delimiter occurrences. In this case the query returns “Learn on freeCodeCamp”:
The count can be positive or negative. If it is negative, occurrences are counted from the end of the string and you get the part of the string after that occurrence. Here's an example:
SELECT SUBSTRING_INDEX("Learn on freeCodeCamp with me", "with", -1);
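The effect of positive and negative counts is easier to see with a delimiter that occurs more than once. The MySQL sketch below uses a made-up host name held in a user variable; the variable name and aliases are illustrative only.

```sql
-- MySQL: count > 0 keeps what is left of the n-th delimiter,
--        count < 0 keeps what is right of the n-th delimiter from the end
SET @host = 'www.freecodecamp.org';

SELECT
    SUBSTRING_INDEX(@host, '.', 1)  AS before_first_dot,   -- 'www'
    SUBSTRING_INDEX(@host, '.', 2)  AS before_second_dot,  -- 'www.freecodecamp'
    SUBSTRING_INDEX(@host, '.', -1) AS after_last_dot,     -- 'org'
    SUBSTRING_INDEX(SUBSTRING_INDEX(@host, '.', 2), '.', -1) AS middle_part; -- 'freecodecamp'
```

Nesting two calls, as in the last column, is a common way to pick out one piece from the middle of a delimited string.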
Conclusion
In this article, you learned how to find a substring in a string in SQL, using MySQL and SQL Server.
CHARINDEX() and PATINDEX() are the functions for finding a substring in a string in SQL Server. PATINDEX() is the more powerful of the two because it accepts wildcard patterns rather than only a literal substring.
Since MySQL has no CHARINDEX() or PATINDEX(), the first example showed how to find a substring in a string with the WHERE clause and LIKE operator.
Translated from the article “SQL Where Contains String – Substring Query Example”.
This should ideally be done with the help of SQL Server full text search if using that.
However, if you can’t get that working on your DB for some reason, here is a performance-intensive solution:
-- table to search in
CREATE TABLE dbo.myTable
(
myTableId int NOT NULL IDENTITY (1, 1),
code varchar(200) NOT NULL,
description varchar(200) NOT NULL -- this column contains the values we are going to search in
) ON [PRIMARY]
GO
-- function to split space separated search string into individual words
CREATE FUNCTION [dbo].[fnSplit] (@StringInput nvarchar(max),
@Delimiter nvarchar(1))
RETURNS @OutputTable TABLE (
id nvarchar(1000)
)
AS
BEGIN
DECLARE @String nvarchar(100);
WHILE LEN(@StringInput) > 0
BEGIN
SET @String = LEFT(@StringInput, ISNULL(NULLIF(CHARINDEX(@Delimiter, @StringInput) - 1, -1), LEN(@StringInput)));
SET @StringInput = SUBSTRING(@StringInput, ISNULL(NULLIF(CHARINDEX(@Delimiter, @StringInput), 0), LEN(@StringInput)) + 1, LEN(@StringInput));
INSERT INTO @OutputTable (id)
VALUES (@String);
END;
RETURN;
END;
GO
-- this is the search script which can be optionally converted to a stored procedure /function
declare @search varchar(max) = 'infection upper acute genito'; -- enter your search string here
-- the searched string above should give rows containing the following
-- infection in upper side with acute genitointestinal tract
-- acute infection in upper teeth
-- acute genitointestinal pain
if (len(trim(@search)) = 0) -- if search string is empty, just return records ordered alphabetically
begin
select 1 as Priority ,myTableid, code, Description from myTable order by Description
return;
end
declare @splitTable Table(
wordRank int Identity(1,1), -- individual words are assigned priority order (in order of occurrence/position)
word varchar(200)
)
declare @nonWordTable Table( -- table to trim out auxiliary verbs, prepositions etc. from the search
id varchar(200)
)
insert into @nonWordTable values
('of'),
('with'),
('at'),
('in'),
('for'),
('on'),
('by'),
('like'),
('up'),
('off'),
('near'),
('is'),
('are'),
(','),
(':'),
(';')
insert into @splitTable
select id from dbo.fnSplit(@search,' '); -- this function gives you a table with rows containing all the space separated words of the search like in this e.g., the output will be -
-- id
-------------
-- infection
-- upper
-- acute
-- genito
delete s from @splitTable s join @nonWordTable n on s.word = n.id; -- trimming out non-words here
declare @countOfSearchStrings int = (select count(word) from @splitTable); -- count of space separated words for search
declare @highestPriority int = POWER(@countOfSearchStrings,3);
with plainMatches as
(
select myTableid, @highestPriority as Priority from myTable where Description like @search -- exact matches have highest priority
union
select myTableid, @highestPriority-1 as Priority from myTable where Description like @search + '%' -- then with something at the end
union
select myTableid, @highestPriority-2 as Priority from myTable where Description like '%' + @search -- then with something at the beginning
union
select myTableid, @highestPriority-3 as Priority from myTable where Description like '%' + @search + '%' -- then if the word falls somewhere in between
),
splitWordMatches as( -- give each searched word a rank based on its position in the searched string
-- and calculate its char index in the field to search
select myTable.myTableid, (@countOfSearchStrings - s.wordRank) as Priority, s.word,
wordIndex = CHARINDEX(s.word, myTable.Description) from myTable join @splitTable s on myTable.Description like '%'+ s.word + '%'
-- and not exists(select myTableid from plainMatches p where p.myTableId = myTable.myTableId) -- need not look into myTables that have already been found in plainmatches as they are highest ranked
-- this one takes a long time though, so commenting it, will have no impact on the result
),
matchingRowsWithAllWords as (
select myTableid, count(myTableid) as myTableCount from splitWordMatches group by(myTableid) having count(myTableid) = @countOfSearchStrings
)
, -- trim off the CTE here if you don't care about the ordering of words to be considered for priority
wordIndexRatings as( -- reverse the char indexes retrieved above so that words occurring earlier have higher weightage
-- and then normalize them to sequential values
select s.myTableid, Priority, word, ROW_NUMBER() over (partition by s.myTableid order by wordindex desc) as comparativeWordIndex
from splitWordMatches s join matchingRowsWithAllWords m on s.myTableId = m.myTableId
)
,
wordIndexSequenceRatings as ( -- need to do this to ensure that if the same set of words from search string is found in two rows,
-- their sequence in the field value is taken into account for higher priority
select w.myTableid, w.word, (w.Priority + w.comparativeWordIndex + coalesce(sequncedPriority ,0)) as Priority
from wordIndexRatings w left join
(
select w1.myTableid, w1.priority, w1.word, w1.comparativeWordIndex, count(w1.myTableid) as sequncedPriority
from wordIndexRatings w1 join wordIndexRatings w2 on w1.myTableId = w2.myTableId and w1.Priority > w2.Priority and w1.comparativeWordIndex>w2.comparativeWordIndex
group by w1.myTableid, w1.priority,w1.word, w1.comparativeWordIndex
)
sequencedPriority on w.myTableId = sequencedPriority.myTableId and w.Priority = sequencedPriority.Priority
),
prioritizedSplitWordMatches as ( -- this calculates the cumulative priority for a field value
select w1.myTableId, sum(w1.Priority) as OverallPriority from wordIndexSequenceRatings w1 join wordIndexSequenceRatings w2 on w1.myTableId = w2.myTableId
where w1.word <> w2.word group by w1.myTableid
),
completeSet as (
select myTableid, priority from plainMatches -- get plain matches which should be highest ranked
union
select myTableid, OverallPriority as priority from prioritizedSplitWordMatches -- get ranked split word matches (which are ordered based on word rank in search string and sequence)
),
maximizedCompleteSet as( -- set the priority of a field value = maximum priority for that field value
select myTableid, max(priority) as Priority from completeSet group by myTableId
)
select priority, myTable.myTableid , code, Description from maximizedCompleteSet m join myTable on m.myTableId = myTable.myTableId
order by Priority desc, Description -- order by priority desc to get highest rated items on top
--offset 0 rows fetch next 50 rows only -- optional paging
If you’re working with a database, whether large or small, there might be occasions when you need to search for some entries containing strings.
In this article, I’ll show you how to locate strings and substrings in MySQL and SQL Server.
I’ll be using a table I call products_data in a products_schema database. Running SELECT * FROM products_data shows me all the entries in the table:
Since I’ll be showing you how to search for a string in SQL Server too, I have the products_data table in a products database:
What We’ll Cover
- How to Query for Strings in SQL with the WHERE Clause and LIKE Operator
- How to Query for Strings in SQL Server with the CHARINDEX Function
- How to Query for Strings in SQL Server with the PATINDEX Function
- How to Query for Strings in MySQL with the SUBSTRING_INDEX() Function
- Conclusion
How to Query for Strings in SQL with the WHERE Clause and LIKE Operator
The WHERE clause lets you get only the records that meet a particular condition. The LIKE operator, on the other hand, lets you match a pattern against a column. You can combine these two to search for a string or a substring of a string.
I was able to get all the products that have the word “computer” in them by combining the WHERE clause and LIKE operator in the query below:
SELECT * FROM products_data
WHERE product_name LIKE '%computer%'
The percent signs before and after the word “computer” each match any run of characters (including none), so the query finds “computer” whether it’s at the start, in the middle, or at the end of the string.
So, if you put the percent sign only before the substring you’re searching for, the pattern matches strings that end with that substring. For example, I got every product that ends with “er” by running this query:
SELECT * FROM products_data
WHERE product_name LIKE '%er'
And if the percent sign comes only after the substring, the pattern matches strings that start with that substring. For example, I was able to get the product that starts with “lap” with this query:
SELECT * FROM products_data
WHERE product_name LIKE 'lap%'
This method also works fine in SQL Server:
How to Query for Strings in SQL Server with the CHARINDEX Function
CHARINDEX() is a SQL Server function for finding the position of a substring in a string.
The CHARINDEX() function takes up to 3 arguments: the substring, the string, and an optional starting position. The syntax looks like this:
CHARINDEX(substring, string, start_position)
If it finds a match, it returns the position where the match starts, but if it doesn’t find a match, it returns 0. Unlike many other languages, counting in SQL is 1-based.
Here’s an example:
SELECT CHARINDEX('free', 'free is the watchword of freeCodeCamp') position;
You can see the word “free” was found at position 1. That’s because its first letter, “f”, is at position 1:
If I specify 25 as the starting position, SQL Server skips the first “free” and finds the match inside the “freeCodeCamp” text:
SELECT CHARINDEX('free', 'free is the watchword of freeCodeCamp', 25);
I was able to use the CHARINDEX function to search for all products that have the word “computer” in them by running this query:
SELECT * FROM products_data WHERE CHARINDEX('computer', product_name, 0) > 0
That query is saying: search each product_name from the start (SQL Server treats a starting position of 0 the same as 1) and give me every product for which CHARINDEX returns more than 0, meaning “computer” was found. This is the result:
How to Query for Strings in SQL Server with the PATINDEX Function
PATINDEX stands for “pattern index”. With this function, you can search for a substring using a wildcard pattern of the kind the LIKE operator accepts (%, _, [] and [^]). These are not full regular expressions.
PATINDEX takes two arguments: the pattern and the string. The syntax looks like this:
PATINDEX(pattern, string)
If PATINDEX finds a match, it returns the position of that match. If it doesn’t find a match, it returns 0. Here’s an example:
SELECT PATINDEX('%ava%', 'JavaScript is a Jack of all trades');
To apply PATINDEX to the example table, I ran this query:
SELECT product_name, PATINDEX('%ann%', product_name) position
FROM products_data
But it listed every product, with the position of the match next to it (0 where there was no match):
You can see it found “ann” at position 3 of the product Scanner. On many occasions, you might not want this behavior because you would want it to show only the items that matched.
I made it return only the matching rows by adding the WHERE clause and LIKE operator:
SELECT product_name, PATINDEX('%ann%', product_name) position
FROM products_data
WHERE product_name LIKE '%ann%'
Now it’s behaving as you would want.
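The same filtering can also be done with PATINDEX alone, by testing its return value in the WHERE clause instead of repeating the pattern with LIKE. The T-SQL sketch below uses a small table variable with made-up rows (the names @demo and its contents are assumptions, not the article's products_data table).

```sql
-- SQL Server: filter on the PATINDEX result itself
DECLARE @demo TABLE (product_name VARCHAR(50));
INSERT INTO @demo (product_name) VALUES ('Laptop'), ('Scanner'), ('Printer');

SELECT product_name, PATINDEX('%ann%', product_name) AS position
FROM @demo
WHERE PATINDEX('%ann%', product_name) > 0;
-- returns one row: Scanner, 3
```

Both forms return the same rows; which one reads better is largely a matter of taste.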
How to Query for Strings in MySQL with the SUBSTRING_INDEX() Function
Apart from the solutions I’ve already shown you, MySQL has a built-in SUBSTRING_INDEX() function with which you can extract a part of a string.
The SUBSTRING_INDEX() function takes 3 compulsory arguments: the string, a delimiter (the substring to split on), and a count. The count has to be a number.
When you specify these arguments, the SUBSTRING_INDEX() function returns the part of the string that comes before the given occurrence of the delimiter. Here’s an example:
SELECT SUBSTRING_INDEX("Learn on freeCodeCamp with me", "with", 1);
In the query above, “Learn on freeCodeCamp with me” is the string, “with” is the delimiter, and 1 is the count. In this case, the query will get you “Learn on freeCodeCamp”:
The count can also be a negative number. If it’s negative, MySQL counts delimiter occurrences from the end of the string and returns the part that comes after that occurrence. Here’s an example:
SELECT SUBSTRING_INDEX("Learn on freeCodeCamp with me", "with", -1);
Conclusion
This article showed you how to locate a substring in a string in SQL using both MySQL and SQL Server.
CHARINDEX() and PATINDEX() are the functions with which you can search for a substring in a string inside SQL Server. PATINDEX() is more powerful because it lets you use wildcard patterns rather than only a literal substring.
Since CHARINDEX() and PATINDEX() don’t exist in MySQL, the first example showed you how you can find a substring in a string with the WHERE clause and LIKE operator.
Thank you for reading!
Hello, dear readers! In this article, I want to show you how to find a substring in a string in SQL.
Finding a substring in a string is very simple in MySQL, because there is a ready-made function for it: LOCATE.
Syntax
SELECT LOCATE('what', 'where') FROM table_name;
Here 'what' is the substring to look for and 'where' is the string (or column) to search in. The query returns the position of the first occurrence of the substring in the string, counted from 1.
If the substring is not found, it returns 0.
Example
Say we have a field with the description of some course, and we want to find the string PHP in it. Here's how we can do it:
SELECT * FROM courses WHERE LOCATE('php', description);
As a result, we get the rows whose description contains the word php. MySQL treats the nonzero position returned by LOCATE as true, so the WHERE condition holds only when a match exists.
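The return values of LOCATE can be checked directly with literal strings. The MySQL sketch below uses the short sample strings from the MySQL manual; the column aliases are illustrative, and the optional third argument (a starting position) is not covered in the text above.

```sql
-- MySQL: LOCATE(substring, string [, start_position])
SELECT
    LOCATE('bar', 'foobarbar')    AS first_match, -- 4
    LOCATE('bar', 'foobarbar', 5) AS from_pos_5,  -- 7
    LOCATE('xbar', 'foobar')      AS no_match;    -- 0

-- The course query from the example, with the comparison written out explicitly:
-- SELECT * FROM courses WHERE LOCATE('php', description) > 0;
```

Writing the condition as `> 0` behaves the same as the shorter form in MySQL and makes the intent more obvious to readers coming from other databases.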
That's all for now. Thanks for your attention, and good luck with your queries!
- Created 28.07.2014 20:48:05
- Mikhail Rusakov
In this post, let us see how to search for a string / phrase in a SQL Server database using a hybrid solution of the T-SQL LIKE operator & the R grep function. Currently, the options that exist in SQL Server to perform a search operation are
Consider the example below: search for and return only the records containing the string “VAT”. The expected result is records 1, 5 & 6.
DECLARE @Tmp TABLE (Id INT, Descrip VARCHAR(500))
INSERT @Tmp SELECT 1, 'my VAT calculation is incorrect'
INSERT @Tmp SELECT 2, 'Private number'
INSERT @Tmp SELECT 3, 'Innnovation model'
INSERT @Tmp SELECT 4, 'ELEVATE'
INSERT @Tmp SELECT 5, 'total VAT'
INSERT @Tmp SELECT 6, 'VAT'
SELECT * FROM @Tmp WHERE Descrip LIKE 'VAT'
SELECT * FROM @Tmp WHERE Descrip LIKE '%VAT'
SELECT * FROM @Tmp WHERE Descrip LIKE '%VAT%'
SELECT * FROM @Tmp WHERE Descrip LIKE '% VAT %'
SELECT * FROM @Tmp WHERE Descrip LIKE '% VAT'
As the example above shows, the first two options offer no straightforward way to do an exact search on a string. It is possible with the third option, the full-text CONTAINS predicate, but that requires a full-text catalog, a unique index, and a full-text index on the table being searched.
If the exact search has to cover the entire database, creating a full-text catalog, unique index, and full-text index on each and every table won’t be a viable option.
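For the narrow case shown in the example, a commonly used plain T-SQL workaround is to pad the column with a space on each side before applying LIKE. This is a swapped-in technique, not the hybrid approach this post goes on to describe, and it only works when words are separated by single spaces; punctuation such as “VAT.” or “(VAT)” would not match.

```sql
-- SQL Server: whole-word match for space-separated text (illustrative workaround)
DECLARE @Tmp TABLE (Id INT, Descrip VARCHAR(500));
INSERT @Tmp (Id, Descrip) VALUES
    (1, 'my VAT calculation is incorrect'),
    (2, 'Private number'),
    (3, 'Innnovation model'),
    (4, 'ELEVATE'),
    (5, 'total VAT'),
    (6, 'VAT');

-- Pad both sides with a space so that a word at the very start or end
-- of the value is also surrounded by spaces, then look for ' VAT '.
SELECT Id, Descrip
FROM @Tmp
WHERE ' ' + Descrip + ' ' LIKE '% VAT %';
-- returns rows 1, 5 and 6
```

The limits of this trick (punctuation, tabs, multiple patterns, searching many tables at once) are the kind of cases where a regex-capable tool such as grep becomes attractive.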
With the hybrid approach [T-SQL LIKE operator & R grep function], let us see the various search types that can be performed [pattern search, exact search, multi-pattern search, and other search scenarios based on collation, case-sensitive/insensitive search, and complex wildcard search].
We have used SQL Server 2019 evaluation edition on Windows 10 64-bit and the WideWorldImporters SQL Server sample database for this example. In this example, we have made use of R Services installed as part of SQL Server.
Install R Services and then enable the external scripting feature from SSMS. Restart the database engine and then verify the installation as described on MSDN.
The script below works on SQL Server 2016 and above (execution of R through T-SQL was introduced in SQL Server 2016). Note that no additional R packages need to be installed for this approach.
A stored procedure named “usp_SearchString” has been created. It can perform normal T-SQL LIKE operations as well as search strings with the R grep function; which one is used is controlled through an input parameter.
The output of the search is stored in a table named “Tbl_SearchString” and is also displayed at the end of the stored procedure's execution.
Below are the input parameters of the stored procedure and their usage details:
If both @ObjectlisttoSearch & @SchemaName are blank, then the entire database is searched, including SQL object definitions.
Multiple values in @ObjectlisttoSearch or @SchemaName should always be separated by commas.
USE [WideWorldImporters]
GO
--Note : Before compiling this SP, search for sqlConnString and provide Databasename, username & password for R SQL connection
CREATE OR ALTER PROC usp_SearchString ( @SearchString NVARCHAR(MAX),
@SearchType VARCHAR(4),
@Match BIT,
@IgnoreCase BIT,
@SearchSQLMetadata CHAR(1),
@SchemaName NVARCHAR(50),
@ObjectlisttoSearch NVARCHAR(MAX),
@SearchCollate NVARCHAR(500)
)
/*************************************************************************
=================
INPUT PARAMETERS:
=================
@SearchString - String to be searched
@SearchType - ES - Exact Search using R
PS - Pattern Search using R
MPS - Multi Pattern Search - OR condition using R
NTLS - Normal T-SQL Like Search
@Match - 0 = LIKE Search, 1 = NOT LIKE Search
@IgnoreCase - 1 = case insensitive search, 0 = Case sensitive search (If @IgnoreCase IS NULL then default : case insensitive search)
@SearchSQLMetadata - Search sql definitions for presence of input string. 1 = Search, 0 = Don't Search
@SchemaName - List of objects to be searched that fall under schema (Multiple schema's can be passed, separated by Comma)
@ObjectlisttoSearch - List of objects to be searched (Multiple table's can be passed, separated by Comma)
--IF BOTH @ObjectlisttoSearch & @SchemaName ARE BLANK THEN ENTIRE DATABASE IS SEARCHED INCLUDING SQL DEFINITIONS
@SearchCollate - For @SearchType = NTLS if @IgnoreCase = 0. To search based on particular collation, default - COLLATE Latin1_General_CS_AS
*****************************************************************************/
AS
BEGIN
SET NOCOUNT ON;
IF @SearchType IN ('ES','PS','MPS','NTLS')
BEGIN
DECLARE @ExecutedBy NVARCHAR(200) = CURRENT_USER
DECLARE @Serv NVARCHAR(200) = CONCAT(CHAR(39),CHAR(39),@@SERVERNAME,CHAR(39),CHAR(39))
IF ISNULL(@SchemaName,'') <> '' OR ISNULL(@ObjectlisttoSearch,'') <> ''
BEGIN
/**** List of table columns to be searched ****/
DECLARE @TableColList TABLE (Cols NVARCHAR(MAX),colname NVARCHAR(200),Tbl NVARCHAR(128),TblCol NVARCHAR(100),ColType NVARCHAR(150))
INSERT @TableColList
SELECT
CASE WHEN TY.name IN ('date','datetime2','datetimeoffset','time','timestamp')
THEN CONCAT('TRY_CONVERT(','VARCHAR(MAX),',C.name,') AS ',QUOTENAME(C.NAME))
ELSE C.name END Columns -- To cover poor data type conversions when passed to R dataframe
,C.name
,CONCAT(SCHEMA_NAME(T.SCHEMA_ID),'.',T.name) TableName
,CONCAT(SCHEMA_NAME(T.SCHEMA_ID),'.',T.name,'.',C.name) TblCol
,TY.name
FROM Sys.tables T
JOIN sys.columns C
ON T.object_id = C.object_id
JOIN sys.types TY
ON C.[user_type_id] = TY.[user_type_id]
-- Ignore the datatypes that are not required
WHERE TY.name NOT IN ('geography','varbinary','binary','text','ntext','image','hierarchyid','xml','sql_variant')
AND (Schema_name(T.schema_id) IN (SELECT value FROM STRING_SPLIT(@SchemaName,','))
OR CONCAT(SCHEMA_NAME(T.SCHEMA_ID),'.',T.name) IN (SELECT value FROM STRING_SPLIT(@ObjectlisttoSearch,',')))
END ELSE
BEGIN
INSERT @TableColList
SELECT
CASE WHEN TY.name IN ('date','datetime2','datetimeoffset','time','timestamp')
THEN CONCAT('TRY_CONVERT(','VARCHAR(MAX),',C.name,') AS ',QUOTENAME(C.NAME))
ELSE C.name END Columns -- To cover poor data type conversions when passed to R dataframe
,C.name
,CONCAT(SCHEMA_NAME(T.SCHEMA_ID),'.',T.name) TableName
,CONCAT(SCHEMA_NAME(T.SCHEMA_ID),'.',T.name,'.',C.name) TblCol
,TY.name
FROM Sys.tables T
JOIN sys.columns C
ON T.object_id = C.object_id
JOIN sys.types TY
ON C.[user_type_id] = TY.[user_type_id]
-- Ignore the datatypes that are not required
WHERE TY.name NOT IN ('geography','varbinary','binary','text','ntext','image','hierarchyid','xml','sql_variant')
END
DROP TABLE IF EXISTS #ExportTablesList
CREATE TABLE #ExportTablesList (Rn BIGINT IDENTITY(1,1),cols NVARCHAR(500),colname NVARCHAR(200),tbl NVARCHAR(200),ColType NVARCHAR(200))
IF @SearchSQLMetadata = 1 OR (@SearchSQLMetadata <> 0 AND (ISNULL(@SchemaName,'') = '' AND ISNULL(@ObjectlisttoSearch,'') = ''))
BEGIN
INSERT #ExportTablesList (cols,tbl,ColType) SELECT 'CONCAT(''<'',object_schema_name(sm.object_id),''.'',object_name(sm.object_id),''|'',o.type_desc COLLATE Latin1_General_100_CI_AS,''>'',sm.definition) AS definition'
,'sys.sql_modules AS sm JOIN sys.objects AS o ON sm.object_id = o.object_id'
,'sql_modules'
END
--Deduplication of object list
;WITH dedup
AS
(
SELECT *,ROW_NUMBER()OVER(PARTITION BY Tbl,Cols ORDER BY Cols) Rn FROM @TableColList
)
INSERT INTO #ExportTablesList
SELECT cols,colname,tbl,ColType FROM dedup
WHERE Rn = 1
AND tbl <> 'dbo.Tbl_SearchString'
/**** List of table columns to be searched ****/
IF (SELECT COUNT(1) FROM #ExportTablesList) <> 0
BEGIN
--Table to hold search output
IF NOT EXISTS (SELECT 1 FROM sys.tables WHERE name = 'Tbl_SearchString')
BEGIN
CREATE TABLE [dbo].[Tbl_SearchString] (
[RunId] FLOAT,
[SearchIndex] BIGINT,
[SearchValue] NVARCHAR(MAX),
[NoOfOccurance] FLOAT,
[ObjectName] NVARCHAR(200),
[ColumnNameORDefinition] NVARCHAR(200),
[SqlDatatype] NVARCHAR(200),
[InputParameter] NVARCHAR(800),
[ExecutedBy] NVARCHAR(200),
[ExecutedAt] DATETIME
)
END
DECLARE @RunId FLOAT
SELECT @RunId = COALESCE(MAX([RunId]),0)+1 FROM [dbo].[Tbl_SearchString]
--Processing to store input parameters
DECLARE @Input NVARCHAR(MAX) = CONCAT(
'@SearchString > ',CASE WHEN @SearchString = '' OR @SearchString IS NULL THEN 'NULL' ELSE @SearchString END
,',@SearchType > ',CASE WHEN @SearchType = '' OR @SearchType IS NULL THEN 'NULL' ELSE @SearchType END
,',@Match > ',COALESCE(@Match,0)
,',@IgnoreCase > ',COALESCE(@IgnoreCase,1)
,',@SearchSQLMetadata > ',CASE WHEN @SearchSQLMetadata = '' OR @SearchSQLMetadata IS NULL THEN 'NULL' ELSE @SearchSQLMetadata END
,',@SchemaName > ',CASE WHEN @SchemaName = '' OR @SchemaName IS NULL THEN 'NULL' ELSE @SchemaName END
,',@ObjectlisttoSearch > ',CASE WHEN @ObjectlisttoSearch = '' OR @ObjectlisttoSearch IS NULL THEN 'NULL' ELSE @ObjectlisttoSearch END)
--By default case insensitive search
SELECT @IgnoreCase = COALESCE(@IgnoreCase,1)
--By default LIKE search
SELECT @Match = COALESCE(@Match,0)
IF @SearchType = 'NTLS'
BEGIN
DECLARE @SearchStrings TABLE (Id INT IDENTITY(1,1),String NVARCHAR(MAX))
INSERT @SearchStrings
SELECT value FROM STRING_SPLIT(@SearchString,'|')
UPDATE #ExportTablesList SET Tbl = 'sys.sql_modules', colname = 'definition' WHERE ColType = 'sql_modules'
SET @SearchCollate = CASE WHEN @SearchCollate = '' THEN NULL ELSE @SearchCollate END
DECLARE @COLLATE NVARCHAR(100)
SET @COLLATE = CASE WHEN @IgnoreCase = 0 THEN CASE WHEN @SearchCollate = '' OR @SearchCollate IS NULL THEN ' COLLATE Latin1_General_CS_AS '
ELSE CONCAT(' COLLATE ',@SearchCollate,' ') END
ELSE CHAR(32) END
DECLARE @SearchOperator NVARCHAR(100)
SET @SearchOperator = CASE WHEN @Match = 1 THEN ' NOT LIKE ' ELSE ' LIKE ' END
DECLARE @WHEREClause NVARCHAR(MAX)
;WITH CTE
AS
(
SELECT 'SearchValue ' + @SearchOperator + '''' + String + '''' + @COLLATE WhereClause FROM @SearchStrings
)
SELECT @WHEREClause = STUFF((SELECT ' OR ' + WhereClause FROM (SELECT WhereClause FROM CTE ) AS T FOR XML PATH('')),2,2,'')
END
SET @SearchString = CASE WHEN @SearchType = 'ES' THEN REPLACE(@SearchString,'"','') ELSE @SearchString END
/**** Loop through above Objects list and execute R script ****/
DECLARE @I INT = 1
,@SQL NVARCHAR(MAX) = N''
,@RScript NVARCHAR(MAX) = N''
,@tblname NVARCHAR(128)
,@Colname NVARCHAR(200)
,@Sqltype NVARCHAR(100)
WHILE @I <= (SELECT MAX(Rn) FROM #ExportTablesList)
BEGIN
SELECT @SQL = CONCAT('SELECT ',Cols,' FROM ',tbl)
,@tblname = Tbl
,@Colname = CASE WHEN @SearchType IN ('ES','PS') THEN cols ELSE colname END
,@Sqltype = ColType
FROM #ExportTablesList WHERE Rn = @I
IF @SearchType IN (
'ES'
,
'PS'
,
'MPS'
)
BEGIN
SET @RScript = '
#Provide DB credential detail for storing output in a table
sqlConnString <-
"Driver=SQL Server;Server=serv; Database=WideWorldImporters;Uid=sa;Pwd=password"
#function to count no of occurences
countCharOccurrences <- function(char,string,Type) {
if (Type ==
"ES"
)
{
Boundchar <- paste
0
(
"\b"
,char,
"\b"
,sep
=""
)
string
1
<- gsub(Boundchar,
""
,string,ignore.case=IgnoreCase)
}
string
1
<- gsub(char,
""
,string,ignore.case=IgnoreCase)
return ((nchar(string) - nchar(string
1
))/nchar(char))
}
#getting input dataset column name into a variable
"c"
c <- colnames(InputDataSet)
if (SearchType ==
"ES"
)
{
ExactString <- paste
0
(
"\b"
,SearchString,
"\b"
,sep
=""
)
Output <- as.data.frame(grep(ExactString,InputDataSet
[
[
c]],ignore.case = IgnoreCase,invert
= Match))
colnames(Output)
[
1
] <-
"SearchIndex"
Output$SearchValue <- grep(ExactString,InputDataSet
[
[
c]],ignore.case = IgnoreCase,value = TRUE,invert
= Match)
Output$NoOfOccurance <- countCharOccurrences(SearchString,Output$SearchValue,SearchType)
}
if (SearchType == "PS" || SearchType == "MPS")
{
  Output <- as.data.frame(grep(SearchString,InputDataSet[[c]],ignore.case = IgnoreCase,invert = Match))
  colnames(Output)[1] <- "SearchIndex"
  Output$SearchValue <- grep(SearchString,InputDataSet[[c]],ignore.case = IgnoreCase,value = TRUE,invert = Match)
  if (SearchType == "PS") {
    Output$NoOfOccurance <- countCharOccurrences(SearchString,Output$SearchValue,SearchType) }
}
Output$ObjectName <- rep(tblname,nrow(Output))
Output$ColumnNameORDefinition <- rep(c,nrow(Output))
Output$SqlDatatype <- rep(Sqltype,nrow(Output))
Output$ObjectName[Output$SqlDatatype == "sql_modules"] <- "sql_modules"
Output$InputParameter <- rep(Input,nrow(Output))
Output$ExecutedBy <- rep(ExecutedBy,nrow(Output))
Output$ExecutedAt <- rep(format(Sys.time(),usetz = FALSE),nrow(Output))
Output$RunId <- rep(RunId,nrow(Output))
sqlDS <- RxSqlServerData(connectionString = sqlConnString,table = "Tbl_SearchString")
rxDataStep(inData = Output, outFile = sqlDS,append = "rows")
'
EXEC sp_execute_external_script
    @language = N'R'
    ,@script = @RScript
    ,@input_data_1 = @SQL
    ,@params = N'@SearchString NVARCHAR(MAX),@SearchType VARCHAR(4),@Match BIT,@IgnoreCase BIT,@Input NVARCHAR(MAX)
    ,@tblname NVARCHAR(128),@Sqltype NVARCHAR(150),@ExecutedBy NVARCHAR(200),@RunId FLOAT
    ,@Serv NVARCHAR(200)'
    ,@SearchString = @SearchString
    ,@SearchType = @SearchType
    ,@Match = @Match
    ,@IgnoreCase = @IgnoreCase
    ,@Input = @Input
    ,@tblname = @tblname
    ,@Sqltype = @Sqltype
    ,@ExecutedBy = @ExecutedBy
    ,@RunId = @RunId
    ,@Serv = @Serv
END
IF @SearchType = 'NTLS'
BEGIN
INSERT [dbo].[Tbl_SearchString]([RunId],[SearchIndex],[SearchValue],[ObjectName],
    [ColumnNameORDefinition],[SqlDatatype],[InputParameter],[ExecutedBy],[ExecutedAt])
EXEC ('SELECT '+@RunId+',SearchIndex,SearchValue,'''+@tblname+''','''+@Colname+''','''+@Sqltype+''','''+@Input+''','''+@ExecutedBy+''',
GETDATE()
FROM (SELECT ROW_NUMBER()OVER(ORDER BY (SELECT 1)) SearchIndex,
'+@Colname+'
AS SearchValue FROM '+@tblname+' ) Tmp WHERE '+@WHEREClause)
END
SET @I = @I + 1
END
/**** Loop through above table list and execute R script ****/
--Display final search result
SELECT * FROM [dbo].[Tbl_SearchString] WHERE RunId = @RunId AND ExecutedBy = CURRENT_USER
END
ELSE
SELECT 'No valid objects passed in the InputParameter to search the string' AS InvalidParameter
END
ELSE
SELECT 'SearchType parameter is mandatory ES - Exact Search, PS - Pattern Search,MPS - Multi Pattern Search - OR condition
,NTLS - Normal T-SQL Like Search' AS InvalidParameter
END
EXEC usp_SearchString @SearchString = 'VAT'
    ,@SearchType = 'ES'
    ,@Match = 0 -- 0 = LIKE, 1 = NOT LIKE
    ,@IgnoreCase = 1 -- 1 = Case insensitive, 0 = Case Sensitive
    ,@SearchSQLMetadata = 0 -- 1 = Search, 0 = Don't Search
    ,@SchemaName = ''
    ,@ObjectlisttoSearch = 'dbo.Tmp'
    ,@SearchCollate = ''
Example 1: To search for the string “Ava” in the Application.People table of the WideWorldImporters database, we can set the parameter values as shown below:
Example 2: In example 1, we did a pattern search. To do an exact search for the string “Ava” in the Application.People table of the WideWorldImporters database, we can set the parameter values as shown below:
Example 3: In example 2, we did an exact search. To do an exact, case-sensitive search for the string “Ava” in the Application.People table of the WideWorldImporters database, we can set the parameter values as shown below:
Example 4: To do an exact, case-sensitive search for the string “male” in the Purchasing.PurchaseOrderLines table of the WideWorldImporters database, we can try setting the parameter values as shown below:
Example 5: Example 4 returned two rows (the records containing “male” and “female”) because @SearchType was set to NTLS (Normal T-SQL Like Search). What we actually expect is a single row, the record that contains only “male”. To repeat the exact, case-sensitive search for the string “male” in the Purchasing.PurchaseOrderLines table of the WideWorldImporters database, we can set @SearchType = 'ES' as shown below:
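The difference between the two search types can be reproduced outside SQL Server. The following is a minimal Python sketch, not part of the procedure: `like_search` stands in for a case-sensitive `LIKE '%male%'`, and `exact_search` wraps the search string in `\b` word boundaries the way the ES branch of the R script does. The function names and sample values are made up for illustration.

```python
import re

def like_search(values, needle):
    """Substring test, roughly what LIKE '%needle%' does under a case-sensitive collation."""
    return [v for v in values if needle in v]

def exact_search(values, needle, ignore_case=False):
    """Whole-word test: the needle must be delimited by word boundaries."""
    flags = re.IGNORECASE if ignore_case else 0
    # re.escape is an addition of this sketch; the procedure passes the
    # search string to grep as a regular expression without escaping it.
    pattern = re.compile(r"\b" + re.escape(needle) + r"\b", flags)
    return [v for v in values if pattern.search(v)]

descriptions = ["male", "female", "Male staff only"]  # made-up sample values

print(like_search(descriptions, "male"))                     # ['male', 'female']
print(exact_search(descriptions, "male"))                    # ['male']
print(exact_search(descriptions, "male", ignore_case=True))  # ['male', 'Male staff only']
```

The substring test also matches “female” because “male” occurs inside it, while the word-boundary pattern does not.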
Example 6: To do a multi-string search delimited by a pipe (search strings “Ava” and “Amy”) in the Application.People table of the WideWorldImporters database, we can set the parameter values as shown below:
Please note that the NoOfOccurance field is populated only for @SearchType = 'ES' and 'PS' (a single-string search without wildcards).
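The restriction follows from how the count is computed: the R function removes every match from the value and divides the number of removed characters by the length of the search string. A rough Python equivalent (a sketch with made-up names, not the procedure itself) shows that this arithmetic only holds when every match has the same length as the search string:

```python
import re

def length_based_count(pattern, text, ignore_case=True):
    """Mirror of the R arithmetic: removed characters / length of the pattern text."""
    flags = re.IGNORECASE if ignore_case else 0
    stripped = re.sub(pattern, "", text, flags=flags)
    return (len(text) - len(stripped)) / len(pattern)

def match_count(pattern, text, ignore_case=True):
    """Count matches directly; works for any regular expression."""
    flags = re.IGNORECASE if ignore_case else 0
    return len(re.findall(pattern, text, flags))

# Single literal string: both approaches agree.
print(length_based_count("Ava", "Ava and ava"))      # 2.0
print(match_count("Ava", "Ava and ava"))             # 2

# Pipe-delimited pattern: the pattern text is 7 characters long, each match only 3.
print(length_based_count("Ava|Amy", "Ava met Amy"))  # about 0.857, not a usable count
print(match_count("Ava|Amy", "Ava met Amy"))         # 2
```

With a multi-string or wildcard pattern, the length of the pattern text no longer equals the length of a match, so the quotient stops being a count.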
Example 7: In example 6, the multi-string search was done using R script. To do the same multi-string search using a normal T-SQL LIKE search, we can set the parameter values as shown below:
This example also shows how to search for a string with a specific collation setting.
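For the T-SQL route, the procedure joins one condition per search string with OR (the `STUFF ... FOR XML PATH('')` step). The exact text of each condition is not shown in this part of the listing, so the Python sketch below only illustrates the idea; the `SearchValue` column name, the `LIKE '%...%'` shape and the helper name are assumptions.

```python
def build_like_where(search_string, column="SearchValue", collation=""):
    """Turn 'Ava|Amy' into OR-ed LIKE conditions, optionally with a COLLATE clause."""
    terms = [t for t in search_string.split("|") if t]
    collate = " COLLATE " + collation if collation else ""
    clauses = []
    for term in terms:
        escaped = term.replace("'", "''")  # double single quotes for a T-SQL literal
        clauses.append(column + collate + " LIKE '%" + escaped + "%'")
    return " OR ".join(clauses)

print(build_like_where("Ava|Amy"))
# SearchValue LIKE '%Ava%' OR SearchValue LIKE '%Amy%'

print(build_like_where("Ava|Amy", collation="Latin1_General_CS_AS"))
# SearchValue COLLATE Latin1_General_CS_AS LIKE '%Ava%' OR SearchValue COLLATE Latin1_General_CS_AS LIKE '%Amy%'
```

A case-sensitive collation such as `Latin1_General_CS_AS` makes the LIKE comparison case-sensitive regardless of the column's default collation.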
Example 8: To do a fixed-pattern search, for example when we know that the string to be searched consists of two letters, we can set the parameter values as shown below:
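One way to express such a fixed pattern as a regular expression is a character class with a repeat count. The Python sketch below assumes the pattern should cover the whole value (anchored with `^` and `$`); whether the original example anchors it is not stated, and the sample values are made up.

```python
import re

# Exactly two ASCII letters, nothing before or after.
TWO_LETTERS = re.compile(r"^[A-Za-z]{2}$")

codes = ["GB", "usa", "U1", "de", "A"]  # made-up sample values
matches = [c for c in codes if TWO_LETTERS.search(c)]
print(matches)  # ['GB', 'de']
```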
Example 9: To search for a phrase in multiple tables delimited by commas, we can set the parameter values as shown below:
Example 10: To search for a date in multiple schemas delimited by commas, we can set the parameter values as shown below:
Example 11: To search for the string “Password” in the entire database, including SQL object definitions, we can set the parameter values as shown below:
Example 12: The example below shows how to do a wildcard search when the search is done using R script; refer to the parameter values shown below:
To learn more about R wildcard search using “?”, “*”, “^” and “$”, please see the link provided in the reference section.
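In R's `grep` these characters are regular-expression metacharacters rather than file-style wildcards: `^` and `$` anchor the start and end of the value, `?` makes the preceding element optional, and `*` repeats the preceding element zero or more times. Python's `re` module follows the same conventions for these four characters, so a small sketch (made-up names and sample values) can show the effect. `fnmatch.translate` plays a role similar to R's `glob2rx`, converting a file-style pattern into a regular expression.

```python
import fnmatch
import re

def grep(pattern, values, ignore_case=False):
    """Return the values in which the regular expression matches anywhere."""
    flags = re.IGNORECASE if ignore_case else 0
    return [v for v in values if re.search(pattern, v, flags)]

names = ["Ava", "Eva", "Avani", "Java"]  # made-up sample values

print(grep("^Av", names))      # starts with "Av"           -> ['Ava', 'Avani']
print(grep("va$", names))      # ends with "va"             -> ['Ava', 'Eva', 'Java']
print(grep("^A?va$", names))   # optional "A", then "va"    -> ['Ava']
print(grep("^Av.*i$", names))  # "Av", anything, ending "i" -> ['Avani']

# File-style wildcard converted to a regular expression first.
glob_regex = fnmatch.translate("Av*")
print([n for n in names if re.match(glob_regex, n)])  # ['Ava', 'Avani']
```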
Tbl_SearchString stores the details of each search made for a string. To see the full details (all the other fields of the table) of the records that match the searched string, we can run queries like the ones shown below:
--To get details of particular RunId
SELECT DISTINCT RunId, [ObjectName]
FROM [WideWorldImporters].[dbo].[Tbl_SearchString]
WHERE RunId = 12

SELECT A.*, B.*
FROM [WideWorldImporters].[dbo].[Tbl_SearchString] A
JOIN (SELECT ROW_NUMBER() OVER (ORDER BY (SELECT 'A')) Rn, *
      FROM Warehouse.StockItemHoldings) B --Change table name
    ON A.SearchIndex = B.Rn
    AND A.ObjectName = 'Warehouse.StockItemHoldings' --Change table name
    AND RunId = 12 --provide run id
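The second query works because SearchIndex is the position of the matching row in the data set that was searched, so numbering the source rows again and joining on that number recovers the full record. The Python sketch below shows the same lookup with made-up rows and a hypothetical helper name. One caveat applies to both versions: the positions only line up if the rows come back in the same order as during the search, and `ORDER BY (SELECT 'A')` does not guarantee any particular order.

```python
def join_hits_to_rows(hits, rows):
    """Attach the full source row to each search hit via its 1-based row number."""
    numbered = dict(enumerate(rows, start=1))  # Rn -> source row
    return [
        {**hit, **numbered[hit["SearchIndex"]]}
        for hit in hits
        if hit["SearchIndex"] in numbered
    ]

# Made-up sample data standing in for the source table and Tbl_SearchString.
rows = [
    {"StockItemID": 1, "BinLocation": "L-1"},
    {"StockItemID": 2, "BinLocation": "L-2"},
    {"StockItemID": 3, "BinLocation": "L-3"},
]
hits = [{"RunId": 12, "SearchIndex": 2, "SearchValue": "L-2"}]

joined = join_hits_to_rows(hits, rows)
print(joined)
```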
The R and Python language extensions were introduced in SQL Server 2016 and 2017, respectively, as part of the machine learning features. With support for R in Azure SQL Database, this approach can be used widely, as it is easy, flexible and supported both on-premises and in Azure SQL Database.
This post gives only an overview of this approach to searching for strings that reside anywhere in a SQL Server database using T-SQL and R script. Depending on the specific requirement, the solution above can be tweaked (for example with other R string packages or glob2rx) to cover further scenarios.