Friday, March 23, 2012

Listing words with no of occurances in sql server

Hi,
Is it possible to get words from text columns in a sql server database in the order of their occurances in that column with full text search or by any other method.
Thanks in advance.try this approach -- Simple Keyword Relevance (http://r937.com/keyword_relevance.html)|||:cool:

My query is regarding retreiving words from table on the bases of its maximum occurances in column irrespective of a row.

means a word that appears maximum number of times in a column, suppose if i have 10 rows in a table and word "Globe" appears maximum no of times say in 5 rows i want Globe word on the top.

Thanks in advance.|||select wordcolumn
, count(*) as occurrences
from daTable
group
by wordcolumn
order
by occurrences desc|||Hi,
Thanks for quick reply...

This could solve my problem if reffered column contain only single word at a time.
But this column is varchar column containing text of multiple words.

Any suggestion?

Thanks|||ok, that's going to require that you split out every word in the column individually

this may require a cross join with a numbers table

what's this for? how often are you going to use it?

the query will definitely not be easy to write, nor efficient to run|||Please read the sticky at the top of the board and post what it asks for...we need some examples|||you gonna write the query, brett?

here's an example --create table somewords
( id integer not null primary key identity
, blah text not null
);
insert into somewords (blah)
values ('a word that appears maximum number of times in a column')
insert into somewords (blah)
values ('Is it possible to get words from text columns in a sql server database')
insert into somewords (blah)
values ('This could solve my problem if reffered column contain only single word')
insert into somewords (blah)
values ('that''s going to require that you split out every word in the column individually')
insert into somewords (blah)
values ('the query will definitely not be easy to write')
insert into somewords (blah)
values ('Please read the sticky at the top of the board')
insert into somewords (blah)
values ('The physical order of data in a database has no meaning')i'm sure you understand the requirement -- which word occurs the most frequent?|||Hi,

r937 quoted the right example and i am sure he understood the problem.

I need it often in my application.

If you say to split words in multiple columns, the problem here is we are not sure about maximum number of words in field.|||Is this for doing a search and ranking the results based on multiple search criteria...if it is then Rudy did not understan...because what you are asking makes no sense

Is it really show me for all rows in a table, the rows with the most occuring types?

If it is, and you need to "do that alot", what in heavens name for?|||Oh, and why would you need a query

http://www.google.com/search?hl=en&lr=&safe=off&q=a&btnG=Search

About 8 billion hits

I think that would "win"|||the question is really very simple, brett

refer to the sample data which i posted (which you wanted to see), and answer the question: which word occurs most frequently in the blah column? hint: it isn't the word "data" and it isn't the word "word"|||OK, well, you would have to start be decontructig the data...but I still don't know what for or why there would be a need for this

create table somewords
( id integer not null primary key identity
, blah text not null
);
insert into somewords (blah)
values ('a word that appears maximum number of times in a column')
insert into somewords (blah)
values ('Is it possible to get words from text columns in a sql server database')
insert into somewords (blah)
values ('This could solve my problem if reffered column contain only single word')
insert into somewords (blah)
values ('that''s going to require that you split out every word in the column individually')
insert into somewords (blah)
values ('the query will definitely not be easy to write')
insert into somewords (blah)
values ('Please read the sticky at the top of the board')
insert into somewords (blah)
values ('The physical order of data in a database has no meaning')
GO

CREATE TABLE UniqueWords (
Word varchar(256)
, WordId int IDENTITY(1,1)
, Add_Dt datetime DEFAULT (GetDate()))
GO

CREATE UNIQUE INDEX UnqueWords_PK ON UniqueWords(Word)
GO

CREATE PROC isp_INS_UNIQUE_WORDS
AS
BEGIN
SET NOCOUNT ON
DECLARE @.Words INT, @.Pos INT, @.x Int, @.str varchar(256)
, @.word varchar(256), @.start int, @.end int, @.exitstart int
SELECT @.Words = 0, @.Pos = 1, @.x = -1, @.Word = '', @.start = 1

DECLARE myCursor CURSOR FOR SELECT Blah FROM SomeWords
OPEN myCursor
FETCH NEXT FROM myCursor INTO @.str

WHILE @.@.FETCH_STATUS = 0
BEGIN
WHILE (@.x <> 0)
BEGIN
SET @.x = CHARINDEX(' ', @.str, @.Pos)
IF @.x <> 0
BEGIN
SET @.end = @.x - @.start
SET @.word = SUBSTRING(@.str,@.start,@.end)
IF NOT EXISTS (SELECT * FROM UniqueWords WHERE Word = @.Word)
INSERT INTO UniqueWords(Word) SELECT @.word
-- SELECT @.Word, @.@.ROWCOUNT,@.@.ERROR
-- SELECT @.x, @.Word, @.start, @.end, @.str
SET @.exitstart = @.start + @.end + 1
SET @.Pos = @.x + 1
SET @.start = @.x + 1
SET @.Words = @.Words + 1
END
IF @.x = 0
BEGIN
SET @.word = SUBSTRING(@.str,@.exitstart,LEN(@.str)-@.exitstart+1)
IF NOT EXISTS (SELECT * FROM UniqueWords WHERE Word = @.Word)
INSERT INTO UniqueWords(Word) SELECT @.word
-- SELECT @.Word, @.@.ROWCOUNT,@.@.ERROR
-- SELECT @.x, @.Word, @.exitstart, LEN(@.str)-@.exitstart, @.str
END
END
FETCH NEXT FROM myCursor INTO @.str
SELECT @.Words = 0, @.Pos = 1, @.x = -1, @.Word = '', @.start = 1
END

CLOSE myCursor
DEALLOCATE myCursor
SET NOCOUNT OFF
RETURN @.Words
END
GO

EXEC isp_INS_UNIQUE_WORDS
GO

SELECT * FROM UniqueWords ORDER BY Word
GO

DROP PROC isp_INS_UNIQUE_WORDS
DROP TABLE UniqueWords, somewords
GO

I'm gonna blog this|||...but I still don't know what for or why there would be a need for this

You don't think there's a use for data mining on simple text?

let's see...

military intel
or
political speech analysis
or
most occurences in scientific notation
...|||brett! dude!! that's wonderful! and the cursors, so pretty!!

now, any chance you could actually address the problem as initially posted?

which word occurred the most often?

boy it's taking you a long time to catch on

:)|||boy it's taking you a long time to catch on

:)

Well, I'll have to step back aand rethink this then

I thought you would need a list of unique words in another table, then perhaps join to it...and something else...don't know...thought the deconstructing thing was a fun puzzle, of which I would have no other way to do that

I was think of something like

...oh jeez, I don't know what I was thinking|||well...I guess I don't know what I'm talking about...especially sice the req is not very clear

No comments:

Post a Comment