How to find repeated words in a string in Python

I am trying to make my function detect duplicated words; the output should be True or False depending on whether the word is repeated. For example:

doubleWord("cat") --> False
doubleWord("catcat") --> True
doubleWord("contour" * 2) --> True

So far I have this:

def main():
    word = input("Enter a string: ")

    # Split at the midpoint; an odd-length string can never be a doubled word.
    half = len(word) // 2
    if word[:half] == word[half:]:
        print("True")
    else:
        print("False")


if __name__ == "__main__":
    main()

Any help would be greatly appreciated. I thought maybe using slicing would make it easier but I have no idea how to implement that in my code. Thanks!

asked Nov 5, 2016 at 21:41

Jane Doe2

You just have to compare the first half with the second. You can do this with slicing:

def doubleWord(word):
    return word[len(word) // 2:] == word[:len(word) // 2]
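One edge case worth noting: with the slicing comparison above, the empty string counts as a doubled word, because both halves are empty. A minimal sketch that makes the odd-length and empty cases explicit (the name is_double_word and the choice to return False for "" are assumptions, not part of the original answer):

```python
def is_double_word(word):
    """Return True if word is some non-empty string written twice."""
    half, remainder = divmod(len(word), 2)
    # Reject empty/one-character input and any odd-length input up front.
    if half == 0 or remainder:
        return False
    return word[:half] == word[half:]


print(is_double_word("catcat"))
print(is_double_word("cat"))
```

Whether "" should count as a doubled word depends on the assignment, so adjust the first check if needed.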

answered Nov 5, 2016 at 21:45

Francisco

Source

Following example:

string1 = "calvin klein design dress calvin klein"

How can I remove the second two duplicates "calvin" and "klein"?

The result should look like

string2 = "calvin klein design dress"

Only the second occurrences should be removed, and the order of the words must not change!

Martin Thoma


asked Oct 17, 2011 at 13:08

string1 = "calvin klein design dress calvin klein"
words = string1.split()
print (" ".join(sorted(set(words), key=words.index)))

This sorts the set of all the (unique) words in your string by the word’s index in the original list of words.

answered Oct 17, 2011 at 13:40

Markus

def unique_list(l):
    # Keep the first occurrence of each item, preserving order.
    ulist = []
    for x in l:
        if x not in ulist:
            ulist.append(x)
    return ulist

a="calvin klein design dress calvin klein"
a=' '.join(unique_list(a.split()))

answered Oct 17, 2011 at 13:12

spicavigo

In Python 2.7+, you could use collections.OrderedDict for this:

from collections import OrderedDict
s = "calvin klein design dress calvin klein"
print(' '.join(OrderedDict((w, w) for w in s.split()).keys()))

answered Oct 17, 2011 at 13:21

NPE

Cut and paste from the itertools recipes

from itertools import filterfalse  # called ifilterfalse in Python 2

def unique_everseen(iterable, key=None):
    "List unique elements, preserving order. Remember all elements ever seen."
    # unique_everseen('AAAABBBCCDAABBB') --> A B C D
    # unique_everseen('ABBCcAD', str.lower) --> A B C D
    seen = set()
    seen_add = seen.add
    if key is None:
        for element in filterfalse(seen.__contains__, iterable):
            seen_add(element)
            yield element
    else:
        for element in iterable:
            k = key(element)
            if k not in seen:
                seen_add(k)
                yield element

I really wish they could go ahead and make a module out of those recipes soon. I’d very much like to be able to do from itertools_recipes import unique_everseen instead of using cut-and-paste every time I need something.

Use like this:

def unique_words(string, ignore_case=False):
    key = None
    if ignore_case:
        key = str.lower
    return " ".join(unique_everseen(string.split(), key=key))

string2 = unique_words(string1)

answered Oct 17, 2011 at 13:22

string2 = ' '.join(set(string1.split()))

Explanation:

.split(): a method that splits a string into a list (without arguments it splits on whitespace)
set(): an unordered collection type that excludes duplicates, so the original word order is not guaranteed to survive
'separator'.join(list): joins the items of the list into one string with 'separator' between the elements
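Because a set drops duplicates, the same building blocks also answer the related question of whether any word repeats, and collections.Counter can say which ones. A small sketch (repeated_words is a hypothetical helper name; it relies on Counter keeping first-seen order, which holds on Python 3.7+):

```python
from collections import Counter


def repeated_words(text):
    """Return the words that occur more than once, in first-seen order."""
    counts = Counter(text.split())
    return [word for word, n in counts.items() if n > 1]


string1 = "calvin klein design dress calvin klein"
words = string1.split()

# A set is smaller than the word list exactly when some word repeats.
has_duplicates = len(set(words)) < len(words)
print(has_duplicates)
print(repeated_words(string1))
```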

answered Nov 9, 2018 at 8:33

string = 'calvin klein design dress calvin klein'

def uniquify(string):
    output = []
    seen = set()
    for word in string.split():
        if word not in seen:
            output.append(word)
            seen.add(word)
    return ' '.join(output)

print(uniquify(string))

answered Oct 17, 2011 at 13:27

ekhumoro

You can use a set to keep track of already processed words.

words = set()
result = ''
for word in string1.split():
    if word not in words:
        result = result + word + ' '
        words.add(word)
print(result.rstrip())  # rstrip() drops the trailing space left by the loop

answered Oct 17, 2011 at 13:10

Pablo Santa Cruz

Several answers are pretty close to this but haven’t quite ended up where I did:

def uniques( your_string ):    
    seen = set()
    return ' '.join( seen.add(i) or i for i in your_string.split() if i not in seen )

Of course, if you want it a tiny bit cleaner or faster, we can refactor a bit:

def uniques( your_string ):    
    words = your_string.split()

    seen = set()
    seen_add = seen.add

    def add(x):
        seen_add(x)  
        return x

    return ' '.join( add(i) for i in words if i not in seen )

I think the second version is about as performant as you can get in a small amount of code. (More code could be used to do all the work in a single scan across the input string but for most workloads, this should be sufficient.)
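For what the single-scan idea might look like, here is a rough sketch that walks the string with re.finditer instead of building the full word list with split() first. The function name iter_unique_words is made up for illustration, and no claim is made that this is faster in practice:

```python
import re


def iter_unique_words(text):
    """Yield each whitespace-separated word the first time it appears."""
    seen = set()
    # \S+ matches one run of non-whitespace characters, i.e. one word.
    for match in re.finditer(r"\S+", text):
        word = match.group()
        if word not in seen:
            seen.add(word)
            yield word


print(" ".join(iter_unique_words("calvin klein design dress calvin klein")))
```

Being a generator, it can also be consumed lazily, for example to stop after the first few unique words.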

answered Oct 17, 2011 at 22:13

Chris Phillips

Question: Remove the duplicates in a string

from collections import OrderedDict

a = "Gina Gini Gini Protijayi"

# fromkeys keeps the first occurrence of each word, in order.
aa = OrderedDict.fromkeys(a.split())
print(' '.join(aa))
# output => Gina Gini Protijayi

answered Jun 16, 2018 at 23:44

Soudipta Dutta

Use the NumPy function np.unique.
Import NumPy first; an alias for the import (np) is conventional:

import numpy as np

To remove duplicates from an array, you can use it this way:

no_duplicates_array = np.unique(your_array)

In your case, if you want the result as a string, you can use the line below. Note that np.unique returns the values sorted, so the original word order is not preserved:

no_duplicates_string = ' '.join(np.unique(your_string.split()))

answered Jun 8, 2020 at 12:04

Sulman Malik

1 and 2 work perfectly:

    s = "the sky is blue very blue"
    s = s.lower()
    slist = s.split()
    print(" ".join(sorted(set(slist), key=slist.index)))

answered Apr 17, 2016 at 16:38

You can remove duplicate or repeated words from a text file or string using the following code:

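from collections import Counter

import nltk
from nltk.tokenize import word_tokenize

# all_words (an iterable of lines) and lemmatize_sentence() are assumed to be
# defined elsewhere; they are not shown in this answer.
new_data = []
for lines in all_words:
    line = ''.join(lines.lower())
    new_data1 = ' '.join(lemmatize_sentence(line))
    new_data2 = word_tokenize(new_data1)
    new_data3 = nltk.pos_tag(new_data2)

    # Below: removal of repeated words. Each (word, tag) pair is joined
    # into one string, so a word is dropped only if its tag repeats too.
    for i in range(len(new_data3)):
        new_data3[i] = "".join(new_data3[i])
    UniqW = Counter(new_data3)
    new_data5 = " ".join(UniqW.keys())
    print(new_data5)

    new_data.append(new_data5)

print(new_data)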

P.S. Adjust the indentation as required.
Hope this helps!

answered Jun 25, 2018 at 7:22

Without using the split function (will help in interviews)

def unique_words2(a):
    words = []
    spaces = ' '
    length = len(a)
    i = 0
    while i < length:
        if a[i] not in spaces:
            word_start = i
            while i < length and a[i] not in spaces:
                i += 1
            words.append(a[word_start:i])
        i += 1
    words_stack = []
    for val in words:  #
        if val not in words_stack:  # We can replace these three lines with this one -> [words_stack.append(val) for val in words if val not in words_stack]
            words_stack.append(val)  #
    print(' '.join(words_stack))  # or return, your choice


unique_words2('calvin klein design dress calvin klein')

Taazar


answered Mar 6, 2020 at 12:06

# Initializing the list
listA = ['xy-xy', 'pq-qr', 'xp-xp-xp', 'dd-ee']
print("Given list : ", listA)

# Using set() and split()
res = [set(sub.split('-')) for sub in listA]

# Result
print("List after duplicate removal :", res)

Peter Csala


answered Oct 17, 2021 at 12:39

To remove duplicate words from a sentence while preserving the order of the words, you can use the dict.fromkeys method (dicts keep insertion order on Python 3.7+).

string1 = "calvin klein design dress calvin klein"

words = string1.split()

result = " ".join(list(dict.fromkeys(words)))

print(result)
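The same insertion-order property can be stretched to cover case-insensitive matching. This is only a sketch: dedupe_words and its key parameter are invented names, and it keeps the spelling of the first occurrence of each word.

```python
def dedupe_words(text, key=None):
    """Drop repeated words, keeping the first occurrence and its order.

    key, if given, maps each word to the value used for comparison
    (for example str.lower for case-insensitive matching).
    """
    first_seen = {}
    for word in text.split():
        # setdefault stores the word only if its key is not present yet.
        first_seen.setdefault(key(word) if key else word, word)
    return " ".join(first_seen.values())


print(dedupe_words("calvin klein design dress calvin klein"))
print(dedupe_words("Calvin klein calvin Klein design", key=str.lower))
```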

answered Dec 1, 2022 at 14:24

Okroshiashvili

You can do that by building the set of the words in the string, which by definition contains no repeated elements. Sorting that set by each word's first position in the original list restores the order, and joining turns the words back into a string:

def remove_duplicate_words(string):
    x = string.split()
    x = sorted(set(x), key=x.index)
    return ' '.join(x)

answered Nov 9, 2018 at 8:28

Source

Du raker

Find words repeated in a text

05.12.2020, 10:40

Hello, I need to find words or parts of words that occur in a text several times. More precisely, I need to find repeated groups of characters. I don't know how to do this. Thanks.

Gdez

05.12.2020, 11:28

Du raker,

st = "Здравствуйте, надо найти слова или части слов, которые встречаются в тексте несколько раз. Еще точнее, найти повторяющиеся группы символов. Как это сделать, я не знаю. Спасибо."
n = len(st)
res = set()
for i in range(n-1) :
    tmp = st[i]
    for j in range(i+1,(n - i) // 2) :
        k = st.count(tmp)
        if k > 1:
            res.add((tmp,k))
        tmp += st[j]
res = sorted(list(res), key = lambda x: -x[1])
print(*res, sep = '\n')

Du raker [OP]

05.12.2020, 12:28

st=" Здравствуйте, здравствуй"

The output should be: дравствуй 2

Gdez

05.12.2020, 12:44

Du raker, ?
The task as stated asks for all repeats. It says nothing about letter case or about partial repeats, so "д", "др", "дра", "драв" and so on all qualify.

Рыжий Лис

05.12.2020, 13:17

>>> import re
>>> st="Здравствуйте, здравствуй"
>>> re.findall(r'(\w+).*(?:\1)', st)
['дравствуй']

Added 40 seconds later

>>> re.findall(r'(\w+).*\1', st)
['дравствуй']
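For anyone puzzling over why this pattern works, here is a commented sketch of the same backreference idea. The case-insensitive variant is an addition of mine, not something from the post above, and the pattern only reports the first repeated group it finds, not every repeat in the text.

```python
import re

st = "Здравствуйте, здравствуй"

# (\w+) captures a run of word characters and \1 demands the same text again
# later in the string. The capital "З" differs from "з", so the captured
# group starts at "д".
match = re.search(r"(\w+).*\1", st)
print(match.group(1))

# With re.IGNORECASE the backreference also compares case-insensitively,
# so the capture can start at the capital letter.
match_ci = re.search(r"(\w+).*\1", st, flags=re.IGNORECASE)
print(match_ci.group(1))
```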

Gdez

05.12.2020, 13:42

Du raker,

st = "Здравствуйте, надо найти слова или части слов, которые встречаются в тексте несколько раз. Еще точнее, найти повторяющиеся группы символов. Как это сделать, я не знаю. Спасибо. здравствуй"
n = len(st)
res = set()
for i in range(n - 1) :
    if not st[i].isalpha() :
        continue
    tmp = st[i]
    for j in range(i+1, n) :
        if not st[j].isalpha() :
            break
        tmp += st[j]
        k = st.count(tmp)
        if k > 1 and len(tmp) > 1 :
            res.add(tmp)            
res = sorted(list(res), key = lambda x: len(x))
tmp = []
for i in range(len(res) - 1) :
    k = 1
    for j in range(i + 1, len(res)) :
        if res[i] in res[j] :
            k = 0
            break
    if k :
        tmp.append((res[i],st.count(res[i])))
tmp.append((res[-1], st.count(res[-1])))
print(*tmp, sep = '\n')

Du raker [OP]

05.12.2020, 18:56

This way of searching is clearer to me:

str=" Здравствуйте, здравствуй"
poisk=""
 
for smeshen in range(1, len(str)-2):
    for i in range(len(str)-smeshen):
        if str[i]==str[i+smeshen]:
            print(str[i], end=' ')
    print()

Added 1 minute later
Could you please tell me how to fill the string poisk with the substring being searched for?
Thanks.
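One possible way to fill such a string, sketched along the lines of the shift-and-compare loop above: for each shift, track runs of consecutive matching characters and remember the longest run seen so far. The function name longest_repeat and the variable names are made up for this sketch; the comparison is case-sensitive, and the two occurrences are allowed to overlap.

```python
def longest_repeat(text):
    """Return the longest substring that occurs at two different offsets."""
    best = ""
    for shift in range(1, len(text)):
        run_start = None
        limit = len(text) - shift
        # The extra iteration (i == limit) closes a run that reaches the end.
        for i in range(limit + 1):
            same = i < limit and text[i] == text[i + shift]
            if same:
                if run_start is None:
                    run_start = i
            elif run_start is not None:
                if i - run_start > len(best):
                    best = text[run_start:i]
                run_start = None
    return best


poisk = longest_repeat(" Здравствуйте, здравствуй")
print(poisk)
```

This is quadratic in the length of the text, which is fine for short strings like the example.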


Source

python strings

Answers

To check whether a string contains repeated characters, you can, for example, walk the string, adding each character you meet to a set and checking whether it was already added earlier:

text = 'Foobaar'
seen = set()
for ch in text:
    if ch in seen:
        print('Was seen before!')
        break
    else:
        seen.add(ch)
# Was seen before!


Python String: Exercise 55 with solution

Write a Python program to find the first repeated word in a given string.

Sample solution:

Python code:

def first_repeated_word(str1):
  temp = set()
  for word in str1.split():
    if word in temp:
      return word
    else:
      temp.add(word)
  return 'None'
print(first_repeated_word("ab ca bc ab"))
print(first_repeated_word("ab ca bc ab ca ab bc"))
print(first_repeated_word("ab ca bc ca ab bc"))
print(first_repeated_word("ab ca bc"))

Sample output:

ab
ab
ca
None


Source
