Forums | developer.brewmp.com

I want to find a word by searching through a text file that contains:
=========================================
abcdefghijklmnopqrstuvwxyz
adgjwpt1adgjwpt2adgjwpt3
=========================================

I seek through the file one character at a time and, at each position, read StringLen characters into a buffer.

pTextBufC is a char pointer that holds the text read from the file.
pTextBuf is an AECHAR pointer that holds the wide-character copy of that text, which is compared with the input.

I don't understand the seek distance.
Searching for "ab" or "mn" works,
but searching for "bc", "cd", and many other substrings fails.

I guess there is something wrong with the seek distance or the width of the input.

The relevant part of the source is below.
The search stops when seeking reaches the end of the file,
or when we find a substring that matches our input.

=========================================
do
{
    /* Read StringLen characters, lower-case them, widen to AECHAR, compare with the input. */
    if (IFILE_Read(pApp->pIFile, pTextBufC, StringLen) &&
        WSTRCMP(STRTOWSTR(STRLOWER(pTextBufC), pTextBuf, 128), pApp->pTextInput) == 0)
        isFound = TRUE;

} while (IFILE_Seek(pApp->pIFile, _SEEK_CURRENT, 1) == SUCCESS && isFound == FALSE);
=========================================

Can anyone explain what's wrong with my code?
Or can you give me an example subroutine that finds a matching word in a text file?

TIA

Let's use an example to see what is wrong.
Imagine that StringLen = 2 and the input you search for is "ab".
The file contains "aabbccdd".
When you start, the file is positioned at the first 'a'.
IFILE_Read(pApp->pIFile,pTextBufC,StringLen)
You read "aa", which is different from "ab". The read also advances the file position by StringLen, so the file is now positioned at the first 'b'.
Then, in your while condition:
IFILE_Seek(pApp->pIFile,_SEEK_CURRENT,1)
you move to the second 'b', so you skip a lot of characters.
You might think you can just change your seek to:
IFILE_Seek(pApp->pIFile,_SEEK_CURRENT, -StringLen + 1)
BUT this is a slow way of searching for a substring; try Google to find a better algorithm. By the way, if the file is small, just load it all first. Otherwise read chunks of (chunkSize + StringLen), search in that buffer, then seek back by StringLen (offset -StringLen) and continue reading and searching.
/kUfa
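
A minimal sketch of the corrected loop, written in standard C rather than BREW so it can be run on a desktop. Here fread and fseek stand in for IFILE_Read and IFILE_Seek, and the name find_by_window and the 128-byte window are assumptions made up for this illustration. It only shows the effect of seeking by -StringLen + 1: each iteration moves the window forward by exactly one byte.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Returns the byte offset of the first occurrence of pat in fp, or -1.
   Read len bytes, compare, then step back len - 1 bytes so the next
   window starts one byte after the previous one. */
long find_by_window(FILE *fp, const char *pat)
{
    char window[128];               /* hypothetical upper bound on the pattern length */
    size_t len = strlen(pat);
    long pos = 0;

    assert(len > 0 && len <= sizeof window);
    if (fseek(fp, 0L, SEEK_SET) != 0)
        return -1;

    /* A short read means fewer than len bytes remain, so no match is possible. */
    while (fread(window, 1, len, fp) == len) {
        if (memcmp(window, pat, len) == 0)
            return pos;
        /* The read advanced the position by len; the net movement must be +1. */
        if (fseek(fp, -(long)len + 1, SEEK_CUR) != 0)
            return -1;
        pos++;
    }
    return -1;
}
```

With the file "aabbccdd" from the example above, searching for "ab" returns offset 1 and searching for "bc" returns offset 3. It still does one read and one seek per character, which is why it is slow.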

In BREW, file reading speed can be improved with IFILE_SetCacheSize.
Please note that BREW file reads and writes are cache based. If a seek causes the cache to be flushed, your algorithm will be slow. I think it would be better to read the data into a 2K or 4K buffer and then do your searching with pointers into that buffer.
ruben
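
A rough sketch of the "search the buffer with pointers" part, in plain C. The function name find_in_block is hypothetical, and the case-insensitive comparison is an assumption based on the STRLOWER call in the original question. The block would be whatever a single read put into the 2K or 4K buffer.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Returns a pointer to the first case-insensitive match of pat (m bytes)
   inside buf (n bytes), or NULL if there is none. */
const char *find_in_block(const char *buf, size_t n, const char *pat, size_t m)
{
    const char *p, *last;
    size_t i;

    assert(buf != NULL && pat != NULL);
    if (m == 0 || m > n)
        return NULL;

    last = buf + (n - m);           /* last position where pat still fits */
    for (p = buf; p <= last; p++) {
        for (i = 0; i < m; i++) {
            if (tolower((unsigned char)p[i]) != tolower((unsigned char)pat[i]))
                break;
        }
        if (i == m)                 /* every byte of pat matched at p */
            return p;
    }
    return NULL;
}
```

Because the whole comparison runs in memory, the file is touched once per block instead of once per character.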

Thanks a lot for this response.
I had misunderstood IFILE_Read :)
I think using IFILE_Seek(pApp->pIFile,_SEEK_CURRENT, -StringLen + 1) will be too slow, but as ruben said, I could use IFILE_SetCacheSize.
I don't know how to implement this yet (still trying to understand).
If you don't mind, please give an example of how to find a substring in a text file efficiently.
I'm learning by building a mini-dictionary that just reads from text files grouped by letter of the alphabet.
(It will be my first application after Hello World) :)
TIA

IFile* m_pIFile;
IFileMgr* m_pIFileMgr;
ISHELL_CreateInstance(pThis->a.m_pIShell, AEECLSID_FILEMGR, (void**)&pThis->m_pIFileMgr);
-- This creates the file manager instance.
Now create the file (only needed if it does not exist yet) and release that handle:
m_pIFile = IFILEMGR_OpenFile(m_pIFileMgr, "textfile.dat", _OFM_CREATE);
if (m_pIFile != NULL) IFILE_Release(m_pIFile);
Now open it for, say, read/write:
m_pIFile = IFILEMGR_OpenFile(m_pIFileMgr, "textfile.dat", _OFM_READWRITE);
Now set the file cache size:
IFILE_SetCacheSize(m_pIFile, SCS_MAX);
You can change the second parameter to suit your needs.
Now read the data:
char* blockData = (char*)MALLOC(2048);
int32 nBytesRead = IFILE_Read(m_pIFile, blockData, 2048);
Now you have nBytesRead bytes of data in the buffer. Use the blockData pointer to do the searching.
If you don't find the string, register a callback with IFILE_Readable and read the next block of data from that callback. I would not recommend reading a whole file in a for loop: if the file is big, the watchdog timer may reset your phone. Always read through the readable callback, and once there is no more data to read, stop calling IFILE_Readable.
ruben

Yes, but you still need an overlap between the previous buffer and the new one to handle strings that start in one block of data and end in the next. You can get it either by seeking back before the next read or by copying the tail of the old buffer to the front of the new one. By the way, the best solution in my opinion is to handle this boundary case separately, so you won't need to implement either of them.
/kUfa
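
A sketch of the second option (copying the tail of the old buffer to the front of the new one), in standard C with fread standing in for IFILE_Read. The name find_chunked and the CHUNK value are made up for this example; CHUNK is tiny on purpose so that a match across a block boundary is easy to produce, and a real buffer would be 2K or 4K as suggested earlier in the thread.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define CHUNK 16   /* illustrative block size, deliberately small */

/* Returns the file offset of the first occurrence of pat, or -1. */
long find_chunked(FILE *fp, const char *pat)
{
    char buf[2 * CHUNK];
    size_t m = strlen(pat);
    size_t keep = 0;      /* bytes carried over from the previous block */
    long base = 0;        /* file offset that corresponds to buf[0] */
    size_t got, have, i;

    assert(m > 0 && m <= CHUNK);
    if (fseek(fp, 0L, SEEK_SET) != 0)
        return -1;

    while ((got = fread(buf + keep, 1, CHUNK, fp)) > 0) {
        have = keep + got;
        for (i = 0; i + m <= have; i++) {
            if (memcmp(buf + i, pat, m) == 0)
                return base + (long)i;
        }
        /* Keep the last m - 1 bytes: a match may start there and end in the next block. */
        keep = (have < m - 1) ? have : m - 1;
        memmove(buf, buf + have - keep, keep);
        base += (long)(have - keep);
    }
    return -1;
}
```

With a 26-letter alphabet file and CHUNK set to 16, the string "pq" starts in the first block and ends in the second, and the carried-over byte is what lets the search find it at offset 15.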

How about stepping through the file with GetLine?
This is my dictionary format:
.....
absolution=pengampunan dosa
absolve=membebaskan diri
absorb="menyerap,meresapi,memahami,asyik"
absorbent=bahan yg dpt menyerap air
.....
To find the exact meaning, I will match whole words only, so that we can differentiate between 'absorb' and 'absorbent'.
My idea is:
1. Save the input word to pTextInput (its length is strLen).
2. Do...While: step through the file line by line with the GetLine function.
3. Save the pointer to the line in plineBuf.
4. Read a substring of length strLen from the line and save it to pTextBuf.
5. Compare pTextInput with pTextBuf.
   IF the words are the same
   {
       read one character forward
       check whether it is '='
       IF it is '='
       {
           read the rest of the line through the GetLine pointer obtained before
           isFound=TRUE
       }
   }
6. Since this is inside the Do...While, it goes back to step 2 until the last line of the file has been read.
==========================================
I think it will be faster than my previous algorithm.
I still don't know how to use IGetLine...
How do I turn this into BREW code?
Also, I want to know how CR, LF, and CRLF are represented.
(I mean in the way that a null value is represented by NULL, or true by TRUE.)
Thanks in advance to all the BREW masters in this forum.
I'm sorry, I'm still a newbie :p
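
The steps above can be sketched in standard C as follows. This is not IGetLine code: fgets stands in for the line reader, the names match_entry and lookup are hypothetical, and lines are assumed to be shorter than 256 bytes. On the CR/LF question: in C source, CR is the character '\r' (0x0D), LF is '\n' (0x0A), and CRLF is the two-character sequence "\r\n"; the sketch cuts off whichever ending a line has.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* If line is "word=meaning" (optionally ending in CR, LF or CRLF) and its key
   equals word exactly, return a pointer to the meaning; otherwise NULL.
   The line ending is cut off in place. */
char *match_entry(char *line, const char *word)
{
    size_t n = strlen(word);

    assert(n > 0);
    line[strcspn(line, "\r\n")] = '\0';   /* strip '\r' and/or '\n' */
    if (strncmp(line, word, n) != 0)
        return NULL;
    /* The character after the key must be '=', so "absorb" does not
       match the line "absorbent=...". */
    return (line[n] == '=') ? line + n + 1 : NULL;
}

/* Scans fp line by line; copies the meaning into out and returns 1 if found. */
int lookup(FILE *fp, const char *word, char *out, size_t outsz)
{
    char line[256];                       /* assumed maximum line length */
    char *meaning;

    assert(outsz > 0);
    rewind(fp);
    while (fgets(line, sizeof line, fp) != NULL) {
        meaning = match_entry(line, word);
        if (meaning != NULL) {
            strncpy(out, meaning, outsz - 1);
            out[outsz - 1] = '\0';
            return 1;
        }
    }
    return 0;
}
```

Checking for '=' right after the key is what implements the whole-word rule from step 5.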

I found the ISource, ISourceUtil, and IGetLine interfaces, which are related to accessing data line by line.
Maybe someone can explain how to use those interfaces?
TIA

For this kind of pattern search, you can use variations of the Boyer-Moore algorithm. Depending on your application's data set, you may be able to do some tweaking to improve performance further. You can find information about it in any algorithms book.
Many of the BREW SDK examples show how to use the ISource, ISourceUtil, and IGetLine interfaces.
ruben
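
As one example of such a variation, here is a sketch of the Boyer-Moore-Horspool simplification in plain C. It is the Horspool variant, not the full Boyer-Moore algorithm, and the function name horspool_find is made up for this illustration. It searches a block that is already in memory.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <string.h>

/* Boyer-Moore-Horspool: returns the index of the first occurrence of
   pat (m bytes) in text (n bytes), or -1 if there is none. */
long horspool_find(const char *text, size_t n, const char *pat, size_t m)
{
    size_t shift[UCHAR_MAX + 1];
    size_t i, pos;

    assert(text != NULL && pat != NULL);
    if (m == 0 || m > n)
        return -1;

    /* A byte that does not occur in pat lets us skip m positions. */
    for (i = 0; i <= UCHAR_MAX; i++)
        shift[i] = m;
    /* For bytes of pat other than its last one, shift so that their
       rightmost occurrence lines up with the text byte. */
    for (i = 0; i + 1 < m; i++)
        shift[(unsigned char)pat[i]] = m - 1 - i;

    pos = 0;
    while (pos + m <= n) {
        if (memcmp(text + pos, pat, m) == 0)
            return (long)pos;
        /* Use the text byte under the pattern's last position to pick the shift. */
        pos += shift[(unsigned char)text[pos + m - 1]];
    }
    return -1;
}
```

The shift is at most the pattern length, so the benefit grows with longer search strings; for two-letter patterns it is close to a plain sequential scan.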

I have just looked up the "Boyer and Moore" search algorithm and its tweaks...
:D
It is simple and may be appropriate for a limited environment like a mobile phone.
I'll try it after I finish my first application (which just uses a conventional sequential search)...

My dictionary is finished, using just a linear sequential search :D
I hope it will be faster :confused: :confused:
"Boyer and Moore" may only be effective when searching for long substrings.
But I'll try it.
It's nice playing with searching... good experience :D
TIA
