Forums | developer.brewmp.com

I have a line of code that continually crashes the T-720, but works fine on the simulator. I have yet to test on another phone.

The code looks like this:

void foo(uint32 total, byte slices){
    uint32 sumPerPlayer;
    DBGPRINTF("total = %d, slices = %d", total, slices); // this works as expected

    sumPerPlayer = total / (uint32)slices; // this line causes a crash

With or without the cast to uint32, the program crashes at the line where the division is performed. It has no trouble assigning the value (if I do sumPerPlayer = total;), but when I try to divide, it crashes. I tried changing the type of slices to uint32, but that didn't help.

I'm totally at a loss as to why this line would crash the phone. I tried using it at a different point in the program and it worked fine, but at the point in the code where it is, it causes a crash every time.

Has anyone seen anything like this?

Your crash has absolutely nothing to do with the uint32 datatype, which is simply a 32-bit integer, native to both Pentium and ARM chips.
I would suspect you're running into a division-by-zero error, plain and simple.

If that's not zero, maybe pop that line in somewhere at the beginning, so you know that nothing else has mucked up the phone first.
Dragon wrote:Your crash has absolutely nothing to do with the uint32 datatype, which is simply a 32-bit integer, native to both Pentium and ARM chips.
I would suspect you're running into a division-by-zero error, plain and simple.

it's definitely not a /0 and it definitely works elsewhere in the code.
This is happening in the midst of a fairly complex loop, but I'm 100% positive that THIS IS the line it crashes on. The line (by itself, in another, less hectic location in the code) works fine, but at this point the phone crashes. NOTE that the simulator does NOT crash... in fact, the simulator prints all the correct digits all around the thing. When this line is removed, the phone doesn't crash; with it in place, the phone prints the line directly above the code and then crashes (failing to print the line directly beneath it). So I have absolutely no doubt that this is, indeed, the line that's causing the trouble.
I was really hoping that someone would say something like "Oh yeah... T-720 has a real tough time dividing uint32s... here's a workaround."
I guess I can't always be that lucky.

Maybe the watchdog? Since you say it's a complex loop, maybe it's taking too much time?

Might be out of stack: the divide calls a runtime routine that uses some stack; the plain assignment doesn't.

Dragon, I was under the impression that the watchdog gives your program upwards of a minute (60 seconds) before it gets angry. This loop is complex by human standards, but the phone (and simulator) deal with it in less than a second.
Ok... I'm going to say this and you guys will say that I'm lying or that I'm crazy... and I would say so too, if this wasn't happening to me. I think next I'll be abducted by a UFO...
I tried leaving the stack and doing the division in the HEAP (malloc'ed all the variables myself and did the division)... no help.
So... then I started replacing the variables with actual numbers... so, rather than doing... C = A / B;
I did the equivalent of this...
uint32 C;
uint32 A = 40000;
byte B = 4;
C = 40000/4;
Works like a dream (no crash).
Then I did
C = A / 4;
Still no problem.
Finally.... C = 40000 / B ;
THIS crashes the phone. Every Time.
So... here is my solution (be ready... it's REALLY ugly).
Because the byte value in this case will never be more than 10 I put in a switch that does the following...
switch (B){
    case 1: C = A; break;
    case 2: C = A / 2; break;
    case 3: C = A / 3; break;
    case 4: C = A / 4; break;
    /* ...etc... etc... up to 10 */
}

I know... it's totally F***ing lame... but I SWEAR TO GOD this fixed the problem!!!!
I am very anxious to test this on another model to see if this is T-720 quirk or a BREW quirk... also, I'm a little worried about True Brew... do they look through your code for ugly nonsense like this and send it back for repair if they don't like how you have implemented something? I know I would send this back in a minute if I saw something like this in code I was "verifying" but for christ's sake, what choice did I have? I mean... it wouldn't let me divide!!!
(I even toyed with the notion of multiplying by an inverse fraction, but I figured it probably handles floats even worse than it does division... and abandoned that thought.)
If I wasn't under a confidentiality contract I would gladly send anyone who's curious the VC++ project and welcome input as to how this can POSSIBLY be happening.
My only thought is that the processor handles hard-coded numbers differently than numbers stored in variables... I've never heard of such a thing, but I'm not as experienced with memory management and low-level bit manipulation as some of you guys, so perhaps someone might notice something here that can explain why
byte A = 3;
C = B / A; // causes the phone to crash
and
C = B / 3; // does not.

Can you strip down a test case and either post it or email it? One where the code still crashes and is a whole program? A snippet most likely won't do, since we all do division and it works fine.
A difference between an immediate and an auto var seems odd; my guess is something else is going on.
charlie

I agree with Charlie. I do not believe this problem is related to the immediate or auto var. I have NEVER heard of anything like it, and like Charlie I've done A LOT of programming throughout the years.
I also do not believe you have fixed the problem. Your solution most likely just cloaks the real problem. While it may be working, you may be better advised to seek out the real culprit, because otherwise it may spring up again at some other place where it is more malign, and you may not be able to reproduce it as easily as in your current test case.
Post some more code, if you can, and I would also recommend looking at the compiler's assembly output for your ARM build.

Oh, and as for the watchdog, the actual time depends on the handset. Some of them are extremely picky, and a second may be sufficient to trigger it on handsets like the Samsung a530.
What handset are you trying this on, by the way?

Dragon,
I'm on the T-720. I too am not a novice programmer and would not say that I am 100% sure that the problem is in that line of code unless I was 100% sure.
I am JUST AS RELUCTANT to believe that the trouble is in that line as you are... it is completely ridiculous, and yet there can be no doubt that the program crashes at that exact line. I could post more code, but it would not shed any extra light on a problem that behaves thusly...
DBGPRINTF("about to divide %d by %d", A, B); // this line prints; both variables report the expected numbers
C = A / B;
DBGPRINTF("result was %d", C); // this line DOES NOT PRINT, and the program has crashed
Don't get me wrong, I TOTALLY understand your reluctance to accept this as a realistic behavior, but this is what is happening. AND the replacement of B with a number relieves the problem.
Again, I am NOT happy with my solution, but it HAS made this particular problem go away. All the variables involved are local stack vars that are discarded after use here, so I can't really see how it might raise its ugly head elsewhere, but I agree that this is a possibility since the whole phenomenon is completely bizarre.

You've been very thorough with your examination of just where the bug occurs, so I doubt I'll be able to help, but purely to satisfy my curiosity, could you give something like this a try?
void foo(uint32 total, byte slices){
    uint32 sumPerPlayer;
    uint32 slicesAsInt = (uint32)slices;
    DBGPRINTF("total = %d, slices = %d", total, slices); // this works as expected
    if (slicesAsInt != 0)
    {
        sumPerPlayer = total / slicesAsInt;
    }
    else
    {
        DBGPRINTF("\nPhew - that was close!");
    }
}
(The code is untested and may contain typos, but you can see what I'm getting at :))
Cheers,
Simon
P.S.
Also, are we certain that a DBGPRINTF that occurs immediately before a handset crash would execute as required? It could be that the code crashes at some other point, and some DBGPRINTFs that should have been displayed get lost (I'm not sure whether logging is synchronous or not). Again, I'm not saying that this is the case at all; I'm just throwing out wild hypotheses as we all do when we're stumped by an unexplainable bug ;)

Simon,
thanks... yeah, I definitely tried the above (as well as every conceivable variation on the above) and the phone crashed every time that division line was in there.
Note, that it had no trouble when that code was run elsewhere in the program. For some reason, there's obviously something else that's CAUSING the problem, but the phone does crash at that line EVERY time.
As for asynchronous printing, I was hoping that was the case, but the sad fact is that when the line is omitted, or replaced with a division in which I use a hard-coded number (instead of that blasted byte), everything works smoothly and the game behaves as expected.
So... something I'm doing... be it looping or memory usage or lord knows what, is causing the T-720 to crash at this totally innocuous seeming point in the program.
I am anxiously awaiting the arrival of my LG 4600 back from Qualcomm. I will see if it behaves in a similar manner. Also, as I said, this division is not causing the simulator any grief, but as we all know, that means almost nothing. However, the simulator is fairly accurate in reporting memory buffer overflows etc, so I doubt that it's something like that.
Anyway... I appreciate everyone suffering with me... it's truly weird.
When (god willing) this game is published in a month or two, I will gladly send the zipped project to anyone who has a T-720 and wants to give it a whirl and let me know what they found.

Are you doing any string operations before calling the function? I faced similar problems, and I finally discovered that the crash was because I was copying more characters into a string than its actual allocated size. (This was before the function call, in some other function.)
The other reason could be your application code changing the address of a pointer.
Just confirm that you are not making the same mistake I was.
regards
Unmesh

no string operations. And if there was a problem with a string or some such, then I don't see how modifying C = A/B; to C = A/4; would help it. But I did have a similar problem a while back, so I made sure that it wasn't some odd variation of that.
As for altering the address of my pointers... what do you mean by that? In this case, I have no pointers, just values... so, I'm not sure what you're suggesting.

Dr.Dre'del wrote:
thanks... yeah, I definitely tried the above (as well as every conceivable variation on the above) and the phone crashed every time that division line was in there.
Fascinating (although probably more "frustrating and despair-inducing" to you :( ). Ok: is the crash deterministic? That is, does it always crash at exactly the same stage of program execution (always on the nth call to this function), or do you sometimes manage to survive some arbitrary number of calls? Is there any pattern to the printed debugging info during a run of your program? I'm also usually reluctant to attribute crashes to handset issues (unless it's a bug occurring in something I'm working on, of course ;)), but I'm really coming round to your point of view here; I can think of no good logical reason for it to happen. Very odd indeed.
There is a certain class of intermittent bug that depends on the structure of your program code: almost invariably due to trampling over areas of memory that you shouldn't be. In these (extremely rare) situations, altering code in some entirely unrelated place can mean the difference between a crash and a clean run, and the effect of any of these changes always appears arbitrary and unpredictable. I'm not seriously proposing that this is the case here, but I'd echo the advice to try and pare your code down to the smallest, simplest test case that will reliably crash, and examine it. Maybe it will contain sufficiently little proprietary code that you will be able to post it here so that we can test it on some other handsets...
Good luck, and let us know how it all works out!
Cheers,
Simon
Edit:
Maybe separating your "complex loop" so that only a portion is executed per tick might be a good idea, also....
Edit2:
Actually, the fact that it works when you use the more complex and resource-intensive "case"-type work-around bamboozles me so much that I really can't think of any useful suggestions at all. Sorry!
Edit3:
Looks like you have answered some of the questions I asked here earlier in the thread. I think you can safely ignore this post :o

You must have messed up the memory somewhere else... dig through the code paths that lead to that division and see if you somehow mucked up the memory or accessed dangling pointers and such... :D
Happened to me once, only the crash was extremely infrequent!
'Dr.Dre'del wrote:Simon,
thanks... yeah, I definitely tried the above (as well as every conceivable variation on the above) and the phone crashed every time that division line was in there. [...]

Well, only in that if the string operation overwrites, it could mess up the stack or something.
The division will cause it to take slightly longer. What about a multiply (B * D) instead of the divide?
The simulator isn't always good for heap/stack stuff either, since Windows lets you get away with a lot, and the ARM model is totally different stack-wise too; the VC version of the function may not even use the stack.
Even so, I'd lay money on the crash at the division being the problem rearing its head, not the root cause.
What were the results of:
putting the division before the function call,
changing it to a *,
and putting the division after the function call?

I agree with all of you that the problem is not the division itself. That would make no sense, and besides, as I said, the same exact division, when inserted into some arbitrary other location in the code (for test sake) works without a hitch.
The problem is that this particular line of code is unavoidable, and, unfortunately, it happens at the end of a fairly convoluted series of tests which I don't have the time to... I'm not sure what the word would be... strip down? I don't know... I'm sure you know what I mean.
In answer to a previous question, the one thing that is 100% reliable is that, as it was (before my oddball switch workaround), the code DID crash at this spot every single time. Furthermore, in spite of the fact that the game uses a complex set of conditionals to get to this point, this point is inevitable; however, it's impossible now to drive the game in a specific way TO this point, because there is so much logic and a fair amount of random behavior that brings us (the code) here.
After the project is finished, I will definitely go back to see what I can strip out and find out what may have been causing this. I will post back to this thread when I do, so if you're getting e-mails when new posts arrive here, you can take a look.
Thank you all again for looking at this with me, I really appreciate it.
Edit 1:
It is SOOO much more enjoyable to chat with you folks about this problem than sit here crashing my phone, rebooting my computer and ripping hair out of my head.
Even if no one here offers me a path to a solution, I would still much prefer to just shoot the sh*t in reference to this problem than actually spend any more time hacking at it. In all my years of programming I have to admit, I have never EVER seen a bug like this before.

I agree with all of you that the problem is not the division itself. That would make no sense, and besides, as I said, the same exact division, when inserted into some arbitrary other location in the code (for test sake) works without a hitch.
The problem is that this particular line of code is unavoidable, and, unfortunately, it happens at the end of a fairly convoluted series of tests which I don't have the time to... I'm not sure what the word would be... strip down? I don't know... I'm sure you know what I mean.
In answer to a previous question, the one thing that is 100% reliable is that as it was (before my oddball switch workaround) the code DID crash at this spot every single time. Furthermore, in spite of the fact that the game uses a complex set of conditionals to get to this point, this point is inevitable, however, it's impossible now, to drive the game in a specific way TO this point because there is so much logic and a fair amount of random behavior that brings us (the code) here.
After the project is finished, I will definitely go back to see what I can strip out and find out what may have been causing this. I will post back to this thread when I do, so if you're getting e-mails when new posts arrive here, you can take a look.
Thank you all again for looking at this with me, I really appreciate it.
Edit 1:
It is SOOO much more enjoyable to chat with you folks about this problem than sit here crashing my phone, rebooting my computer and ripping hair out of my head.
Even if no one here offers me a path to a solution, I would still much prefer to just shoot the sh*t in reference to this problem than actually spend any more time hacking at it. In all my years of programming I have to admit, I have never EVER seen a bug like this before.
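Since a divide-by-zero was the first suspect raised in this thread, here is a minimal sketch of ruling it out explicitly. The function name and fallback are hypothetical, but the signature mirrors the `foo(uint32 total, byte slices)` from the original post; on ARM, integer division goes through a runtime helper routine, so a zero divisor is one of the few ways a plain division can genuinely take the phone down.

```c
#include <stdint.h>

/* Sketch (not the poster's actual code): guard the divisor before
   dividing, so a divide-by-zero can be ruled out as the crash cause. */
uint32_t sum_per_player(uint32_t total, uint8_t slices)
{
    if (slices == 0) {
        /* hypothetical fallback; a real app might log and bail out */
        return 0;
    }
    return total / (uint32_t)slices;
}
```

If the guard never fires and the crash persists, the divisor really is nonzero and the corruption lies elsewhere.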

Wouldn't it be a blessing in cases such as this to have on-device debugging? Ah, the wonderful world of BREW.... and that reminds me... time to change my sig...

I'm curious how they wrote Brew itself. Is it possible that the OS was written sans on-device debugging? I can't imagine that's true: either they must have a true Emulator (note the capital E) that isn't just a rough approximation of the hardware but a circuit-for-circuit image of SOME phone, or they have an on-device debugger. I can't... I won't... believe that they would release an OS for millions of phones having never run it through anything more rigorous than the Logger.

Your oddball solution will cause a crash right at the moment you least want one... like right when a carrier is about to sign off your app and their QA just wants to run a quickie final test... or it may only crash when QA wants to show your latest and greatest to the visiting VP :D
Crashes never go away until you have truly found the cause and fixed it.
On actual handsets, I have had code issues that should have caused the phone to halt immediately, but no, the phone would continue happily until a very inopportune time, then crash. :mad:
Dr.Dre'del wrote:I agree with all of you that the problem is not the division itself. That would make no sense, and besides, as I said, the same exact division, when inserted into some arbitrary other location in the code (for test sake) works without a hitch.
The problem is that this particular line of code is unavoidable, and, unfortunately, it happens at the end of a fairly convoluted series of tests which I don't have the time to... I'm not sure what the word would be... strip down? I don't know... I'm sure you know what I mean.
In answer to a previous question, the one thing that is 100% reliable is that as it was (before my oddball switch workaround) the code DID crash at this spot every single time. Furthermore, in spite of the fact that the game uses a complex set of conditionals to get to this point, this point is inevitable, however, it's impossible now, to drive the game in a specific way TO this point because there is so much logic and a fair amount of random behavior that brings us (the code) here.
After the project is finished, I will definitely go back to see what I can strip out and find out what may have been causing this. I will post back to this thread when I do, so if you're getting e-mails when new posts arrive here, you can take a look.
Thank you all again for looking at this with me, I really appreciate it.
Edit 1:
It is SOOO much more enjoyable to chat with you folks about this problem than sit here crashing my phone, rebooting my computer and ripping hair out of my head.
Even if no one here offers me a path to a solution, I would still much prefer to just shoot the sh*t in reference to this problem than actually spend any more time hacking at it. In all my years of programming I have to admit, I have never EVER seen a bug like this before.

I have a hunch that the DBGPRINTF is causing the problem. Yeah, I know you did all the testing assuring all of us that it is definitely not the problem, but have you tried just printing "hello" and then doing your division as in your original code? The "%d" conversion of a "byte" just seems shaky to me.
-Aaron

I agree with ai8.
Remove the DBGPRINTF, or do what I do and wrap it with:
#ifdef AEE_SIMULATOR
DBGPRINTF("Foo");
#endif
EDIT:
I agree with ai8 for different reasons: a lot of phones don't support DBGPRINTF on device, and I've seen DBGPRINTF crash some phones. The only phone I know offhand where I can use DBGPRINTF is the Motorola C343, using the AppLogger tool.
Put ifdefs around all your DBGPRINTFs as a general rule of thumb, or fear the wrath of Brew. :)
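The ifdef pattern above can be rolled into a single variadic macro so device builds compile the logging away entirely. This is a sketch that compiles anywhere: `SIMULATOR_BUILD`, `LOG`, and the `log_printf` stand-in are hypothetical names for what a BREW project would spell `AEE_SIMULATOR` and `DBGPRINTF`.

```c
#include <stdio.h>
#include <stdarg.h>

/* Compile debug output away unless building for the simulator. */
#ifdef SIMULATOR_BUILD
#define LOG(...) log_printf(__VA_ARGS__)
#else
#define LOG(...) ((void)0)   /* stripped from device builds */
#endif

static int log_calls = 0;    /* lets us observe whether LOG fired */

/* stand-in for DBGPRINTF: forwards to stderr and counts calls */
static void log_printf(const char *fmt, ...)
{
    va_list ap;
    va_start(ap, fmt);
    vfprintf(stderr, fmt, ap);
    va_end(ap);
    log_calls++;
}
```

In a device build (no `SIMULATOR_BUILD`), every `LOG(...)` call site collapses to `((void)0)`, so there is no chance of the print itself crashing the handset.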

I'm not sure I agree with the DBGPRINTF being at fault; for one thing, I have made very extensive use of DBGPRINTF on this particular handset with no issues (except when I output a string that was too long), and also, the conversion from a byte is not the issue here, as the Dr has said he tried it with "slices" being an int.
I stand by my "something's trampling over memory it shouldn't" assessment (either that, or the good Dr is the victim of a voodoo curse! ;))
Of course, it's definitely worth trying if he hasn't done so already :)

I don't remember offhand how the T720 handles DBGPRINTF, but I do recall phones crashing as a result of the call. But you're probably right, this isn't the solution to the problem.

Dr.
The guys at Qualcomm are using the ARMulator, which is a very good emulator that works with native ARM code, as well as JTAG debuggers etc. While all these tools are available to Brew developers as well - even though they are pretty expensive - they are of little use because Qualcomm refuses to release certain information and data that would be required to run these tools, such as ROM images, etc. Shame, isn't it?

i thought i saw someone mention using that gdb stub again?
yeah, all the, dare i say, important qualcomm developers have jtag debuggers on their desks as well as the real simulator, but they won't give it out for fear of us all going off and ripping off their os/code and/or making phones that have free service ;) pity you can download software from the net that gives you most of the stuff they fear, except anything useful to us.
its possible to get a JTAG-enabled phone from qualcomm, i have a couple, but what you have to do to get them, or the quantity available, i know not.

Yeah, the Qualcomm paranoia is quite distracting sometimes, especially if you're a professional developer who's used to getting access to material, documentation, hardware specs, hardware, and so forth. You would think that the stuff nVidia is doing is much more hardcore and bleeding edge than anything Qualcomm has ever done, and yet they will be more than happy to give you all the information you need. It's only a cell phone for Christ's sake...

see... you know you've stumbled across something really NASTY when people start pointing fingers at printf() !!! :)
No... the problem is NOT in DBGPRINTF(). There was no DBGPRINTF in there when the problem first showed up (why would there be, I don't go printing every value just for the hell of it). It took me almost two days of sprinkling DBGPRINTFs around just to FIND the problem, and then another few hours to allow myself to believe that that WAS the place where it was crashing (because I, just like you guys, was saying "NO F**ING WAY is a division crashing the phone!").
In any case, you don't have to convince me that it's a crappy workaround... I'm very VERY aware of that. However, I have not been able to find a reasonable alternative, short of rewriting the structure of the entire program (which may not fix the problem, seeing as how it's unclear what is actually CAUSING this division failure).

Dr. Dre'del,
I found a similar problem.
The code below ran into link errors (something like "_div3" / "_moddiv"; I don't remember exactly):
void foo( int number )
{
    int seconds = number / 1000;
    int milliseconds = number % 1000;
    /* ... some DrawText calls ... */
}
So, I removed the literal constant from the division:
void foo( int number )
{
    int base = 1000;
    int seconds = number / base;
    int milliseconds = number % base;
    /* ... some DrawText calls ... */
}
It works for me, try to do that...
int seconds = number / (int)1000; doesn't work either...
I'm using gcc 3.3.1.
I hope this helps.
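For reference, a compilable version of the workaround described above, with the literal divisor hoisted into a variable. One caveat worth hedging: on ARM targets without a hardware divide instruction, GCC emits a call to a runtime helper (e.g. `__divsi3` from libgcc) for both the literal and the variable form, so if one form links and the other doesn't, the difference is in which helper the toolchain pulled in, not in the arithmetic itself.

```c
/* Sketch of the poster's workaround: split milliseconds into
   seconds + remainder using a divisor held in a variable. */
void split_time(int number, int *seconds, int *milliseconds)
{
    int base = 1000;                /* divisor kept in a variable   */
    *seconds = number / base;       /* compiles to a __divsi3 call  */
    *milliseconds = number % base;  /* compiles to a __modsi3 call  */
}
```

Either way, the behavior is identical; only the generated calls into the compiler runtime can differ between toolchain versions.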

that's VERY funny! I like it! If you're not joking, I still thank you for the laugh.
If you read through the entire thread you would have noticed that I have the EXACT opposite situation!
My hard-coded constants divide without a problem (now on both the T-720 AND the LG4600), while the variable versions crash each time.
The interesting thing would be to see what else you and I are doing in our code that may be similar, and to find out why (if there are similarities) it would produce such an odd side effect.

I'm not joking... but I ran into this problem at link time...
Try to use uint32 for all your vars instead of mixing uint32 and byte...
I don't know, maybe the target handset is messing up the alignment used by the compiler/linker...
PS: When I change "int base = 1000" to "const int base = 1000" I run into the same problem.

which compiler versions are you two using ?

gnude gcc 3.3.1

I'm also using GNUDE, not sure how to determine the version.
As for the var types, as I said, if you read through the entire thread you will see that ANY var type causes this problem, other than naked static numbers.
To say that this is bizarre is a grave understatement.

Have you tried making as minimal an application as possible and seeing if the error occurs?
Maybe you're using too much stack, or you have a wild store in your code. It's possible there's a bug in the library code, but then it would be surprising that everyone who uses GCC and does a divide (for ARM) isn't complaining about the bug.
my $.02

mykes, yeah... the bug is not IN the division, and the same division works fine elsewhere. However, there's something leading up to the division that makes the division crash. The whole point of this thread was to find out if anyone else had a similar problem and what the root of it may have been.
I tried moving stuff that precedes the bug line into the stack and then into the heap, but that doesn't seem to affect it. So, I'm out of ideas.

Again, my $.02
You have a serious bug in YOUR logic. It'd be best to find it ;-)

I doubt there's a bug in my logic; what is much more likely is that something I'm doing (although completely valid and legal) is trampling over memory and causing this issue. There is really nothing you should be able to do that would result in a situation where suddenly the processor can't divide two values where neither is 0 (and the second is a variable rather than a hard-coded digit). You must admit that if I were to challenge you to CREATE a bug that causes such behavior, you would not be able to arrange it. :)
But I appreciate your advice, and am sure you're right in that I do need to find what is actually causing this problem... of course, you can imagine that finding such a thing is fantastically difficult since it can be absolutely ANYTHING in the code leading up to that point... and there's a LOT of code!

The ARM processor doesn't have a divide instruction. Division is done by subroutine. If you trample on memory, maybe you've destroyed the instructions in the subroutine? Or if you're using too much stack, your autos and stack arguments can be in Brew's variable space (or your own)?
Those are two things off the top of my head I'd be looking for.
Another is you're writing to memory you've already freed and that's in use by something else.
It's pretty clear that if a minimal application works, then something in your more complex application is broken.
I suggest you use #if 0 .. #endif around as much of your code as you can, then remove them one at a time until you see what's causing the bug.
Good luck.

Have you tried rearranging the order of your subroutines?
That is, move the subroutine in question to a different place in the file.

An earlier discussion mentioned DBGPRINTF("%d", val) where val is a uint8. I've seen that cause crashes on systems other than BREW.
Another thing to note, if you are using C++, is that if you delete "this", you can still access the member variables and your code won't crash until much later. That's right: on some systems, accessing member variables after "this" is gone is a surefire crash, but not on Brew. I had one instance where my app continued happily most of the time and would only crash once in a while...
Your only way out is to track down each delete and free and see if you access the deleted/freed stuff somewhere...
Also, watch out for cases where you access variables after they have gone out of scope, like referencing a string that was created on the stack.
Tricks like adding printfs to make the code work around a crash are asking for trouble, because you are only masking the problem.
mykes wrote:Have you tried rearranging the order of your subroutines?
That is, move the subroutine in question to a different place in the file.
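One small defensive habit against the use-after-free scenario described above: free through a helper that also nulls the pointer, so any later access fails fast at a NULL dereference instead of quietly touching memory the heap has already handed to someone else. The helper name is mine, not a BREW API.

```c
#include <stdlib.h>

/* Free *p and null it out, so a dangling pointer cannot be
   dereferenced silently. Safe to call more than once. */
void free_and_null(void **p)
{
    if (p != NULL && *p != NULL) {
        free(*p);
        *p = NULL;
    }
}
```

This doesn't catch every case (copies of the pointer stay dangling), but it turns a class of "crashes much later, somewhere unrelated" bugs into immediate, reproducible ones.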

Here is another check: make sure all your string operations include the "null" character. If you are malloc'ing space for a string, make sure to do a "+1". For wchar, (sizeof(wchar) * (len + 1))..., but hopefully the "check list" will help you out...
accolade wrote:An earlier discussion mentioned DBGPRINTF("%d", val) where val is an uint8. I've seen that causing crashes on other system than BREW.
Another thing to note if you are using C++ code is that if you delete "this", you can still access the member variables and your code won't crash until much later. That's right, on some system, accessing member variables after "this" is gone is a surefire crash, but not on Brew. I had one instance where my app continued happily most of the times and would only crash once in a while...
Your only way out is to track down each delete and free and see if you access the deleted/freed stuff somewhere...
Also, watch out for cases where you access variables after they have gone out of scope, like referencing a string that was created on a stack.
The action below including adding printfs to make the code work around a crash is asking for trouble because you are only masking the problem.
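The "+1 for the terminator" checklist item above can be captured in two tiny helpers. These are illustrative, not BREW APIs: BREW's wide type is AECHAR, and `unsigned short` merely stands in for it here so the sketch compiles anywhere.

```c
#include <stdlib.h>
#include <string.h>

/* Duplicate a C string, allocating len + 1 bytes for the NUL. */
char *dup_string(const char *src)
{
    size_t len = strlen(src);
    char *out = malloc(len + 1);      /* the +1 is the whole point */
    if (out != NULL)
        memcpy(out, src, len + 1);    /* copies the terminator too */
    return out;
}

/* Bytes needed for a wide string: sizeof(wchar) * (len + 1),
   exactly as the checklist suggests. */
size_t wide_alloc_size(size_t len)
{
    return sizeof(unsigned short) * (len + 1);
}
```

Forgetting the +1 produces precisely the kind of one-byte overrun that corrupts a neighbor now and crashes somewhere else later.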

I'm not sure if you are going to see a resolution to your problem in this thread, dear Doctor, so I propose that we derail it slightly with war-stories about our favourite and most inexplicable bugs, and how we finally unearthed and eradicated them. Who knows - maybe a case-study detailed by one of us will have some striking similarity with your own problem, and lead you to a solution. At the very least, the feel-good stories of triumph may help to keep you inspired during your quest, and perhaps provide you all with some small entertainment :)
I started BREWing back in February, and my first ever BREW task, in my first-ever job, in a company that had never developed BREW apps in-house, was to convert a rather sprawling J2ME game to BREW 1.1 (I'm writing all this bumph in the hopes that you won't think too badly of me when I detail the catalogue of disasters and poor design choices I made :)). I chose to write in C++, as I'm more comfortable with C's successor than with C itself, and the syntactic and conceptual similarities between Java and C++ should have made conversion a breeze. Having made this central design choice, I set about making my first skeletal C++ BREW app, and then started laying the groundwork that would be required for converting the game - mainly little details like a "safe" array class that would allow me to emulate Java's cavalier "new-and-forget" approach to memory management, a class that emulated key aspects of Java's Graphics class, etc; the kind of preparations that, once completed, should make conversion into BREW as thoughtless and automatic a task as I could manage.
Everything went fantastically, eerily well - my first 15 working days of BREW were spent experimenting with graphics drawing operations, conducting time-trials, writing my mini-Java emulation layer, etc etc - one-off tasks that I'll never need to do again. A week after these preliminaries were complete, the game was up and running - sans fancy front-end, of course, but this toppled with spooky rapidity the following week. Most of the conversion, now that the Java emulation was in place, consisted of making very simple syntactic changes, line by line - the kind of thing a well-trained monkey, or perhaps a CS graduate (I kid! I kid! :)) could do. In retrospect, things would have been much easier and quicker if I had written a perl script to apply the changes and do the minor intelligence-requiring clean-up afterwards, but you know how things go - you start off, get to the 1/2-way mark (which, in actuality, almost always turns out to be closer to the 1/10-way mark), and think to yourself "Weeeell... I could stop doing this and just write a script instead, but I'm halfway done already so I might as well just carry on...".
So everything was going incredibly well; a few minor bugs here and there, but nothing that wasn't quashed in a few minutes - a couple of hours, tops. Until, that is, a minor & entirely innocuous change to an obscure bit of code unleashed The Bug From Hell.
Woo - this is turning out to be longer and more verbose than I had anticipated. I'm loath to clutter up the thread with off-topic stuff like this, but if anyone wants to hear the rest, let me know :)

I'm not sure if you are going to see a resolution to your problem in this thread, dear Doctor, so I propose that we derail it slightly with war-stories about our favourite and most inexplicable bugs, and how we finally unearthed and eradicated them. Who knows - maybe a case-study detailed by one of us will have some striking similarity with your own problem, and lead you to a solution. At the very least, the feel-good stories of triumph may help to keep you inspired during your quest, and perhaps provide you all with some small entertainment :)
I started BREWing back in February, and my first ever BREW task, in my first-ever job, in a company that had never developed BREW apps in-house, was to convert a rather sprawling J2ME game to BREW 1.1 (I'm writing all this bumph in the hopes that you won't think too badly of me when I detail the catalogue of disasters and poor design choices I made :)). I chose to write in C++, as I'm more comfortable with C's successor than with C itself, and the syntactic and conceptual similarites between Java and C++ should have made conversion a breeze. Having made this central design choice, I set about making my first skeletal C++ BREW app, and then started laying the groundwork that would be required for converting the game - mainly little details like a "safe" array class that would allow me to emulate Java's cavalier "new-and-forget" approach to memory management, a class that emulated key aspects of Java's Graphics class, etc; the kind of preparations that, once completed, should make conversion into BREW as thoughtless and automatic a task as I could manage.
Everything went fantastically, eerily well - my first 15 working days of BREW were spent experimenting with graphics drawing operations, conducting time-trials, writing my mini-Java Emulation layer, etc etc - one-off tasks that I'll never need to do again. A week after these preliminaries were complete, the game was up and running - sans fancy front-end, of course, but this toppled with spooky rapidity the following week. Most of the conversion, now that the Java emulation was in place, consisted of making very simple syntactic changes, line by line - the kind of thing a well-trained monkey, or perhaps a CS graduate (I kid! I kid! :)) could do. In retrospect, things would have been much easier and quicker if I had written a perl script to apply the changes, and do the minor intelligence-requiring clean-up afterwards, but you know how things go - you start off, get to the 1/2-way mark (which, in actuality, almost always turns out to be closer to the 1/10-way mark), think to yourself "Weeeell...I could stop doing this and just write a script instead, but I'm halfway done already so I might as well just carry on...".
So everything was going incredibly well; a few minor bugs here and there, but nothing that wasn't quashed in a few minutes - a couple of hours, tops. Until, that is, a minor & entirely innocuous change to an obscure bit of code unleashed The Bug From Hell.
Woo - this is turning out to be longer and more verbose than I had anticipated. I'm loath to clutter up the thread with off-topic stuff like this, but if anyone wants to hear the rest, let me know :)

oh, COME ON! This is a ploy, right? This is your own unique take on Amazon.com's "here's the first chapter of this book you'll undoubtedly pay whatever we ask for, once you've been exposed to this snippet of introduction".
What happened?! I need to know! Name your price, you dastardly story weaver... I'll give you unfettered access to my paypal account.
btw, this thread is, as far as I'm concerned, no longer viable as a source of solution to my problem. Two weeks have gone by and I have done every possible thing to weed out the problem and it is no closer now than it was on day one.
Some of my attempts included trying out a % operation (same exact result, and same exact solution... it'll take a static number but not a variable).
moving the entire method to a completely different portion of the program which is executed subsequent to a timer call (and elaborate release of memory)...
an abolition of passing the byte value in, and instead passing a pointer to the original byte
as well as a number of other less simply explainable attacks.
If my ugly workaround (with which I started this thread) wasn't COMPLETELY successful (to all appearances) I would not be writing this, as I would have already killed myself.
So... worry not about cluttering an already cluttered thread (for dog doo's sake, agonized posters started pointing fingers at printf a page ago! you know the useful ideas have run their course when printf is being accused of being buggy).
Lastly... your story is going to be particularly interesting to me because as soon as I'm done with the brew version (hopefully next week), I have to... yep, you guessed it... convert this baby to J2ME. And of course I don't have the luxury of a class for class conversion because my version was written in C... so... I'm sure I'll be posting all sorts of war stories here very soon.

Dr Dre'del's rash words of encouragement have doomed you all to a further retelling of my tale of Treachery and Woe, so without further ado I offer up
My Crappy Bug Part II
In Whiche Our Hero Uncovers a Crappy Bug, and Muche Despair Ensues
This next block doubles as both a clumsy piece of foreshadowing and a cautionary tale for those making the leap from J2ME to BREW.
The central core of any J2ME application (i.e. that part which deals with construction, initialisation etc) always inherits from the Midlet class. The corresponding central core of a BREW C++ project inherits from AEEApplet. Taking this correspondence much, much further than I had any right to, I decided that the first task of my conversion should be to shape my AEEApplet class in the image of the original Midlet class. For reasons that I'll expound upon a little later (they do not involve stupidity on the part of the author of the original J2ME version, who has a 1st class Maths degree from Cambridge(!) and is an all-round smart guy), this Midlet-derived class had approximately 4000 lines of code, and God only knows how many member variables. Having spent a couple of tedious days transferring/amending the original code so that it sat comfortably in my AEEApplet class, I stumbled upon the wonderful fact that classes that inherit from AEEApplet are in fact not constructed, per se; they are simply allocated, so all of the vast droves of member variables were in turn not getting properly constructed/initialised at startup.
"Fiddle-de-dee," I thought to myself, "now I shall have to write a pseudo-constructor/ destructor, and for each of the gazillions of the member variables, I shall have to replace their declaration in the class with a pointer to the class, and manually construct all of these members in my pseudo-constructor, and destroy them in my destructor. Oh, and change all of the code to reflect the fact that all references must be changed to pointers"*
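For anyone who hasn't hit this yet, here is a minimal sketch of the pseudo-constructor pattern described above. The class and member names are hypothetical (not from the original project); the point is that BREW merely allocates an AEEApplet-derived object without running its constructor, so every member that needs construction is held by pointer and built by hand:

```cpp
#include <cassert>
#include <cstdlib>
#include <new>

// Hypothetical member class that needs genuine construction.
class Board {
public:
    explicit Board(int size) : size_(size) {}
    int size() const { return size_; }
private:
    int size_;
};

// Stand-in for an AEEApplet-derived class. BREW allocates this
// object malloc-style, so no constructor runs automatically.
class GameApplet {
public:
    // Pseudo-constructor, called manually from the app's init code.
    bool Construct() {
        board_ = new (std::nothrow) Board(8);  // heap-construct each member
        return board_ != 0;
    }
    // Pseudo-destructor, called manually before the applet is freed.
    void Destruct() {
        delete board_;
        board_ = 0;  // NULLify so a stray second Destruct() is harmless
    }
    Board* board_;  // a pointer instead of a by-value member
};
```

In real BREW code, Construct() would be called from the app's creation hook and Destruct() from its shutdown path; the pain described in the story is doing this for dozens of members and turning every reference in the code into a pointer.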
You are no doubt all on the edge of your seats, having suffered total shutdown of major parts of the brain including those required to maintain the proper upright posture. So I shall now resume my thrilling narrative proper.
The game had been running flawlessly for a good few days now, so I spent a little time making minor tweaks and suchlike: prodding a routine until it yielded up a little extra speed; tidying up the code here and there, etc.: altering things that are quite remote from the core game logic. Casually testing out the latest build, only differing slightly and in very minor ways from the previous, I was met by an Access Violation upon app closure.
"Botheration," I thought to myself, "this will no doubt prove to be quite a chore to fix"**
Probing further with VC++'s superb debugging facilities (which I miss now that I am currently working in J2ME :(), the problem seemed to stem from calling delete on a pointer to a certain member of a certain class. Probing still further, and with the knowledge that this particular member should have already been deleted and NULLified at this stage raising the hairs on the back of my neck, I noticed that the value of this pointer was - here it comes - 1. Not the comforting regularity of the 0xcdcdcdcd indicating that the pointer has never been initialised in the first place, or the slightly less comforting pseudo-random value reminiscent of a valid pointer to memory which has since been de-allocated, but 1. Uno. Ein. The pointer, you could say, had the value 1. This is very odd, and so I did that weirdly superstitious thing that otherwise rational programmers often do - I ran the thing again, in the vain hope that this bizarre error was caused by a stray neutrino interacting with my processor, or the particular alignment of Saturn in the aspect of Jupiter at that particular millisecond in time, or perhaps the infinitesimal shift in gravitational pull on my hard drive caused by a goat in Venezuela moving to graze a different patch of grass - you know, the Usual Suspects.
Same result, despite the fact that neutrino-matter interactions are shockingly rare, that Saturn had probably moved several kilometers since the last run, and that our goat could surely not have exhausted his fresh grass supply so soon.
"Heavens to Betsy," I thought to myself, "it seems that something utterly and completely inexplicable has occurred, which will no doubt cost me many hours of tedious slog to track down".
The most obvious culprit was that perhaps I had accidentally passed this pointer to a function that increments one of the arguments it received (remember, by now the code was a horrid mish-mash of pointers and references and everything else in between, so this was not entirely unlikely). Poring over the code, I could see no conceivable way this could occur. I exhausted a whole bunch of increasingly unlikely hypotheses, and got to the point where I doubted both the compiler and my sanity. There was simply no possible way that this pointer could be assigned such a neat, well-formed value. None. It made no logical sense. Soaked in frustration, and despairing at the fact that the one tool that I could not do without was malfunctioning and turning against me, I slunk home to horrid dreams of laughing pointers spontaneously assuming the value of unity, just to mock me.
Stay tuned for the next (and thankfully, final) installment of my dreadful monologue :)
* "Fiddle-de-dee" has replaced the expletive beginning with the same letter. Oh, and you know those disastrous design choices I alluded to earlier? This is probably the biggest one. For the love of all that is holy, kids, please don't make the same crashingly stupid mistake I did :(.
** Yes, my internal monologue really does often take this form.

I'll wait to comment on the bulk of your sordid tale until it's completed, but I thought I would interject with a mention of the Eclipse IDE, which, if you're not familiar with, you should become so post haste. Eclipse makes VC++ look like a hack slapped together by recently defrosted neanderthals (and, as one might expect of truly good software, it's free).
I haven't started writing my J2ME in it yet, so I can't vouch for the J2ME plug-in, but certainly when it comes to any and all J2EE uses (including JSTL, Beans, servlets, and everything in between) it's absolutely marvelous. And from what I understand, someone has actually written a C++ compiler plug-in for it which I also have not tried, but will be looking at as soon as I'm done with my current project.
BTW, what region of the world do you live in, and would you be interested in/have time for some freelance work?

Dr.Dre'del wrote:I'll wait to comment on the bulk of your sordid tale until it's completed, but I thought I would interject with a mention of the Eclipse IDE, which, if you're not familiar with, you should become so post haste. Eclipse makes VC++ look like a hack slapped together by recently defrosted neanderthals (and, as one might expect of truly good software, it's free).
I haven't started writing my J2ME in it yet, so I can't vouch for the J2ME plug-in, but certainly when it comes to any and all J2EE uses (including JSTL, Beans, servlets, and everything in between) it's absolutely marvelous. And from what I understand, someone has actually written a C++ compiler plug-in for it which I also have not tried, but will be looking at as soon as I'm done with my current project.
Oh, yes - when I was experimenting with Java, just when I was starting out, Eclipse was suggested to me and it rocks. The most amazing thing about it is its "check for errors while you're working" feature - absolutely astonishing, and extremely helpful when you are learning. Sadly, it does not cope well with having pre-processor operations scattered throughout the code (which is often a necessity for developing for multiple handsets, most of which have quirky J2ME implementations that require handset-specific workarounds), so I just stick with JCreator. I've never used it with C++, but I'll try to check it out when I get the chance.
Oh, and I agree with your comment about free software - FireFox, LaTeX, gaim, cygwin - superb pieces of software that put their commercial/closed-source rivals (as applicable) to shame. I simply cannot make it through the day without Firefox and Cygwin - all of my resource building for any game I write is always done in perl using unix-like utilities. God bless the open-source movement! :)
Dr.Dre'del wrote:
BTW, what region of the world do you live in, and would you be interested in/have time for some freelance work?
I'm in the UK, working for one of the premier mobile game development studios (iomo.co.uk) in this country - the core of the company used to develop for the PC, and are responsible for Carmageddon amongst other things :cool:. Sadly, I am utterly snowed under at the moment - and I'm not sure my employers would look kindly on my taking on additional work from elsewhere ;). Thanks a lot for the offer, though (if offer it was!) - as someone who is just starting out in commercial programming, it's a nice confidence boost :)
The third and final part of my travesty wrought upon mankind contains a few more snippets about the difficulties of J2ME games development; I'll write it up when I get the chance :)

simon wrote:Eclipse was suggested to me and it rocks
I will second that! Eclipse really is quite a piece once you learn how to use it. And the plugins functionality allows you to do virtually anything out of it, using well defined and behaved APIs.
After using its JDT for Java development, I've tried using CDT, the C development environment plugin for Eclipse, and was a little let down - it is ages behind its Java counterpart. But the guys are working on it, and pretty soon I'm sure it will be a good replacement for whatever C IDE we use today when developing for the BREW emulator.

Ok, I went back and stripped the program down to its underwear and finally stumbled across what's causing it.
You ready?
MALLOC.
No... no... don't argue with me... it's MALLOC.
check it out...
byte foo, bar;
foo = 30;
bar = 5;
foo %= bar; //THIS WORKS FINE
//here we put a call to MALLOC (I won't type it out... but trust me, it's valid)
foo = 30;
bar = 5;
foo %= bar; // HERE THE PHONE CRASHES
All this code is happening in the APP_START case of the appName_HandleEvent() method switch. There is NOTHING else in the program.
So... I'm using GNUDE and can only imagine that this is the problem. Can anyone else who is using GNUDE run this simple test (write a program that mallocs some memory then try to divide a variable by another variable ((It's imperative that the divisor be a variable, because if it's a static number then the phone doesn't crash))).
Is there an alternative to MALLOC? I mean, is there some other library method I can try to see if GNUDE compiles that better?
Now, before you all yell at me for blaming GNUDE, please understand that I have tried every variation of the "fix" and there is no question that the MALLOC is working correctly (it allocates the memory I ask for) and that the whole program works fine, with the one exception that after using MALLOC I can no longer divide one variable by another (or take its modulus, since that's also a division).
As always... any and all advice is welcome (except if you're going to tell me that what I'm saying is "impossible", because I know it sounds crazy... but it is what it is).
edit 1. oh, and as before, all this works like whipped cream on a sundae in the emulator, which is why I'm blaming GNUDE and not BREW :)
edit 2. Just tried IHEAP_Malloc()... same exact result. Sheesh.
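[Editor's note: one plausible explanation for these exact symptoms, offered as an assumption rather than a confirmed diagnosis of this toolchain: the ARM cores in these handsets have no hardware divide instruction, so GCC compiles a division or modulus by a *variable* into a call to a libgcc runtime helper such as __udivsi3, whereas division by a *constant* is expanded inline into multiplies and shifts. If the GNUDE link step fails to pull those helpers in (or resolves them to a bad address), variable division jumps into garbage and crashes, constant division keeps working, and the x86 emulator, which has a native divide instruction, never notices. A sketch of the kind of work such a helper does:]

```c
/* Sketch of a software unsigned-division routine, the kind of helper
 * (like libgcc's __udivsi3) that the compiler emits a call to when
 * the target CPU has no divide instruction. Classic restoring
 * shift-and-subtract long division. */
static unsigned int soft_udiv(unsigned int num, unsigned int den)
{
    unsigned int quot = 0, rem = 0;
    int i;
    if (den == 0)
        return 0;  /* real helpers typically trap or return junk here */
    for (i = 31; i >= 0; i--) {
        rem = (rem << 1) | ((num >> i) & 1u);  /* bring down next bit */
        if (rem >= den) {
            rem -= den;
            quot |= 1u << i;                   /* this quotient bit is 1 */
        }
    }
    return quot;
}
```

A remainder helper (the modulus case above) is the same loop returning rem instead of quot. If this is indeed the cause, explicitly linking the target's libgcc.a, or supplying the division helpers yourself, would be the fix; dumping the asm, as suggested below, would confirm whether the call to the helper is actually being emitted and where it lands.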

I would say that you really need to dump the asm to see what is going on. Also, turn off optimizations (for this test), take a look and make sure you check alignment issues, as in,
http://brewforums.qualcomm.com/showthread.php?t=1919&highlight=alignment
I suspect the line(s) you left out above is/are causing the problem. However, we will never know until they are supplied.
I suspect if you post all the code in its completeness, the problem will reveal itself.
You might think its completely stripped down, but what about code in the Init (and free) sections?
I've done a lot of testing with the T720 and lots and lots of MALLOCs. I can tell you that it will absolutely crash in certain cases when alignment issues (related to memory read or write) come up.
BTW, does this happen when foo and bar are 32 bit ints?
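[Editor's note: for readers unsure what "alignment issues" means in practice: ARM7-class cores cannot load a 32-bit word from an address that is not a multiple of 4; depending on the core and configuration, the misaligned load silently rotates the bytes or raises an abort, while x86 (and hence the emulator) handles it without complaint. Byte buffers carved up by hand out of MALLOC'd memory are a classic source of such pointers. A minimal sketch of the unsafe pattern and a safe alternative, with hypothetical names:]

```c
#include <string.h>

/* Unsafe on ARM: reading a 32-bit word through a possibly misaligned
 * pointer. This compiles and appears to work on x86/the emulator, but
 * on an ARM7 core the misaligned load misbehaves or aborts:
 *
 *     unsigned int v = *(unsigned int *)(buf + 1);   // DON'T
 *
 * Safe everywhere: copy the bytes out with memcpy; the compiler emits
 * byte-sized loads where the target requires them. */
static unsigned int read_u32(const unsigned char *p)
{
    unsigned int v;
    memcpy(&v, p, sizeof v);
    return v;
}
```

The same applies to writes: memcpy from a local variable into the buffer rather than storing through a cast pointer.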

jmiller,
optimization is turned off.
Variable type is not relevant (though if you start at the top of this thread you will note that the crash goes away when the divisor is changed from a variable to a hard-coded number (i.e. 7)).
Aside from freeing the memory I am mallocing (all 10 bytes of it) there is NOTHING else in this program. Posting the code would be meaningless because it's example code from the SDK itself.
I'm not sure what alignment means, I read the thread you link to above but it sheds no light on my situation because I have no pointers to anything in my crash sample code and this behavior is happening on both of the phones on which I'm testing (T-720 and LG4500).
The thing is that there is nothing even remotely tricky or kludgy in the code. I'm not messing round with bit manipulation or accessing registers, so there is no "hidden" place for this to be happening. It's right out in the open, in the midst of totally clean code (of which there is very very little).
So, reluctantly, I'm blaming the GNUDE compiler, at least until I hear from someone that they did the exact same thing with their GNUDE and it worked. Then I'll just have to kill myself.

Dr.Dre'del wrote:Aside from freeing the memory I am MALLOCing (all 10 bytes of it) there is NOTHING else in this program. Posting the code would be meaningless because it's example code from the SDK itself.
Could you post the ELF and MOD files? I'd like to take a look at those.

Dr.Dre'del wrote:jmiller,
optimization is turned off.
Variable type is not relevant (though if you start at the top of this thread you will note that the crash goes away when the divisor is changed from a variable to a hard-coded number, e.g. 7).
It can be very relevant where alignment issues exist.
Dr.Dre'del wrote:
Aside from freeing the memory I am MALLOCing (all 10 bytes of it) there is NOTHING else in this program. Posting the code would be meaningless because it's example code from the SDK itself.
To duplicate your problem, there needs to be a base to work from. If you post this base, regardless of where it came from, then someone else can duplicate the behavior and drill down to find the bug (even if it is in the compiler).
I can't tell you how many times someone was SO SURE there was nothing odd in the code, and then it turned out that there was. Why beat your head against the wall?
Seeing '10 bytes' makes me suspicious of alignment issues.
Dr.Dre'del wrote:
I'm not sure what alignment means; I read the thread you link to above, but it sheds no light on my situation because I have no pointers to anything in my crash sample code, and this behavior is happening on both of the phones on which I'm testing (T-720 and LG4500).
You are allocating some memory but you have no pointers? What is the result of the MALLOC being saved to? Are you casting? (again, alignment)
Let's see the line.
Dr.Dre'del wrote:
The thing is that there is nothing even remotely tricky or kludgy in the code. I'm not messing around with bit manipulation or accessing registers, so there is no "hidden" place for this to be happening. It's right out in the open, in the midst of totally clean code (of which there is very, very little).
So, reluctantly, I'm blaming the GNUDE compiler, at least until I hear from someone that they did the exact same thing with their GNUDE and it worked. Then I'll just have to kill myself.
Don't kill yourself. Post the code. Someone will find the problem. Post the C file, the header file(s) that are yours, the make, etc. Everything needed to build it.
Did you inspect the ASM?

ditto, we'll be a lot more willing to help you track it down with a test project.
You might think your code is 100% correct, and we could spend an hour or more trying to recreate your situation with code that's actually correct, whereas with the code you are using it could be as simple as running lint or examining the generated code, determining whether it's an end-user or compiler fault, etc.
I'm not sure I understand your reluctance to post something when you want people to help you out, though; we're all busy too.
charlie

no no... no reluctance... I am more than happy to post the code... I totally understand all your points.
Yes, indeed there is the one pointer for the MALLOC (it's a char *, as you will see in the posted code), but regarding the 10 bytes: the reason I mentioned it is that I'm not using a lot of memory; I'm just grabbing 10 bytes for a pointer that never actually gets used.
Anyway... I'll post everything in a few minutes and as always I appreciate all your help.

ok... I found a caveat that may clarify things slightly (though not for me).
The T-720 crashes the first time I ask it to run a modulus op. The LG4500, however, doesn't crash right away (though it returns a bizarre and incorrect result). Then, after I do a MALLOC and attempt another % op, that's when the LG4500 crashes.
I'm posting the C file, the mod file and the make file.
I am not using any other header files (other than the stuff that comes with the SDK). This is all being built to the 1.1 spec and works as expected until the modulus ops start to show up.
I'm using an IDISPLAY_DrawText() because on the LG4500 DBGPRINTF takes over a minute to print (and sometimes never does). However, I promise you (cross my heart and hope not to die) that when you use DBGPRINTF instead of IDISPLAY_DrawText, nothing changes.
edit 1. Oh, and I changed my bytes to uint16s just to shake things up a bit. It doesn't matter though, the variable type has no effect on the problem.

Dr.Dre'del wrote:ok... I found a caveat that may clarify things slightly (though not for me).
The T-720 crashes the first time I ask it to run a modulus op. The LG4500, however, doesn't crash right away (though it returns a bizarre and incorrect result). Then, after I do a MALLOC and attempt another % op, that's when the LG4500 crashes.
I'm posting the C file, the mod file and the make file.
I am not using any other header files (other than the stuff that comes with the SDK). This is all being built to the 1.1 spec and works as expected until the modulus ops start to show up.
I'm using an IDISPLAY_DrawText() because on the LG4500 DBGPRINTF takes over a minute to print (and sometimes never does). However, I promise you (cross my heart and hope not to die) that when you use DBGPRINTF instead of IDISPLAY_DrawText, nothing changes.
edit 1. Oh, and I changed my bytes to uint16s just to shake things up a bit. It doesn't matter though, the variable type has no effect on the problem.
Almost there... need your BID and your MIF file.

ok, the above zip file has been edited to include the MIF and BID files.
have fun :)

I just tried your mod on a T720 with "S/W Version 01.02.22.17p"..
and it doesn't crash; it displays:
45 mod 8 = 5
45 mod 8 = 1
45 mod 8 = 1
Incidentally, if I try to recreate your mod file, I am unable to generate it...
I get the following errors:
D:\gnude/bin/arm-elf-ld: Warning: D:\gnude/lib/gcc-lib/arm-elf/3.3.1/thumb/libgcc.a(_umodsi3.o) does not support interworking, whereas HELLOWORLD.elf does
D:\gnude/bin/arm-elf-ld: Warning: D:\gnude/lib/gcc-lib/arm-elf/3.3.1/thumb/libgcc.a(_dvmd_tls.o) does not support interworking, whereas HELLOWORLD.elf does
D:\gnude/bin/arm-elf-ld: D:\gnude/lib/gcc-lib/arm-elf/3.3.1/thumb/libgcc.a(_umodsi3.o)(__umodsi3): warning: interworking not enabled.
D:\gnude/bin/arm-elf-ld: first occurrence: HELLOWORLD.o: arm call to thumb
D:\gnude/BREWelf2mod.exe HELLOWORLD.elf HELLOWORLD.mod
Unknown type "R_ARM_THM_PC22"
make: *** [HELLOWORLD.mod] Error 1
-Tyndal

Dr.Dre'del wrote:ok, the above zip file has been edited to include the mif and bid files.
have fun :)
(From your makefile)
This seems wrong:
LIBDIRS = -L$(GCCHOMEPATH)/lib/gcc-lib/arm-elf/3.3.1/thumb
Looks like you are linking with Thumb code but not compiling for Thumb (which I don't think is supported anyway).
Code works fine on my T720 compiled with my libs set to
LIBDIRS = -L$(GCCHOMEPATH)/lib/gcc-lib/arm-elf/2.95.3
---jeff
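The suggested fix, as a makefile fragment (paths assume the gnude layout discussed in this thread; adjust the version directory to whatever your install actually has):

```makefile
# Before: links against the Thumb build of libgcc.a -- the /thumb suffix
# is what pulled in the incompatible division/modulus helpers.
#LIBDIRS = -L$(GCCHOMEPATH)/lib/gcc-lib/arm-elf/3.3.1/thumb

# After: plain (non-Thumb) libgcc.a from the same compiler version.
LIBDIRS = -L$(GCCHOMEPATH)/lib/gcc-lib/arm-elf/3.3.1
```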

ok, I'm totally confused.
tyndal's T-720 doesn't crash, though it clearly isn't doing the math right (note that 45 % 8 should never EVER (not even with Bush as president) be 1!)
You guys are getting different results; how is that possible?
If you look at the code you will observe (as I have been saying) that there is nothing in it that should cause any problems whatsoever. tyndal, do you have the cygwin compiler to test this on... if my calculations are correct (diabolical laughter here) it will compile and run correctly when GNUDE is taken out of the equation.
I'm not sure what to make of the compiler error; I'm no makefile wizard, but the files I put in that zip are what I used to generate the mod, so where should I go looking for the compilation error? (especially since I'm not getting one)
LIBDIRS = -L$(GCCHOMEPATH)/lib/gcc-lib/arm-elf/2.95.3
won't work for me because I don't have a 2.95.3 directory (probably owing to the fact that I have a more recent version).
I suppose I can try to download the older version and see if that helps (since I'm sitting around blaming GNUDE, it may be the obvious thing to try).
So... tyndal... what do you have that I don't have (or vice versa) that's allowing me to compile this thing without a hitch, but throwing an error for you?
edit 1. for starters I'm running S/W 1.2.22.44.AP
I think the thing to focus on is that in spite of the fact that your phone isn't crashing, it's still printing 1 in both places where it is asked to take the modulus of two variables (however, it does the math right in the first instance, where the divisor is a hard-coded 5). I'm SO happy to see that I'm not crazy after all! WHEEEEEE.

Dr.Dre'del wrote:
LIBDIRS = -L$(GCCHOMEPATH)/lib/gcc-lib/arm-elf/2.95.3
won't work for me because I don't have a 2.95.3 directory (probably owing to the fact that I have a more recent version).
then you should have a
LIBDIRS = -L$(GCCHOMEPATH)/lib/gcc-lib/arm-elf/3.3.1/
directory (non-Thumb) that was built when you built the compiler.
By the way, my output is:
45 mod 8 = 5
45 mod 8 = 5
45 mod 8 = 5

Dr.Dre'del wrote:ok, I'm totally confused.
tyndal's T-720 doesn't crash, though it clearly isn't doing the math right (note that 45 % 8 should never EVER (not even with Bush as president) be 1!)
You guys are getting different results; how is that possible?
If you look at the code you will observe (as I have been saying) that there is nothing in it that should cause any problems whatsoever. tyndal, do you have the cygwin compiler to test this on... if my calculations are correct (diabolical laughter here) it will compile and run correctly when GNUDE is taken out of the equation.
Well, I just used the mod you provided; it looks like jmiller actually recompiled it, and I get the same result as him if I recompile. I had to remove the "thumb" part of your lib dir and compile with my gnude environment (that was the only change I made to your gccMakefile):
LIBDIRS = -L$(GCCHOMEPATH)/lib/gcc-lib/arm-elf/3.3.1
By "cygwin compiler" do you mean the arm compiler? Yes, I have it and it works fine as well.
By "works fine", I mean it doesn't crash, AND all three output strings are "45 mod 8 = 5".
Quote:
I'm not sure what to make of the compiler error; I'm no makefile wizard, but the files I put in that zip are what I used to generate the mod, so where should I go looking for the compilation error? (especially since I'm not getting one)
If you are using my buildme.bat stuff, the output goes in errorLog.txt; look there for errors.
Quote:
So... tyndal... what do you have that I don't have (or vice versa) that's allowing me to compile this thing without a hitch, but throwing an error for you?
edit 1. for starters I'm running S/W 1.2.22.44.AP
I think the thing to focus on is that in spite of the fact that your phone isn't crashing, it's still printing 1 in both places where it is asked to take the modulus of two variables (however, it does the math right in the first instance, where the divisor is a hard-coded 5). I'm SO happy to see that I'm not crazy after all! WHEEEEEE.
So, it looks like the problem might be the attempt to compile for Thumb... which, as jmiller said, I don't think is supported with gcc:
http://brewforums.qualcomm.com/showthread.php?t=4703&highlight=thumb
http://brewforums.qualcomm.com/showthread.php?t=5004&highlight=thumb
-Tyndal

Ok, I know I'm not the sharpest knife in the drawer, but I feel like I should be understanding you and I'm not...
So, I'm going to ask you to treat me like an idiot (I promise not to get offended) and be a bit more verbose.
Specifically...
I never compiled a compiler, because I'm using gnude, so GCCHOMEPATH in this case is C:\gnude.
I have a directory (and a line in the makefile) pointing to -L$(GCCHOMEPATH)/lib/gcc-lib/arm-elf/3.3.1/thumb
Are you saying I should remove the /thumb part? Or are you saying that by virtue of not having a 2.95.3 directory I am lacking files essential to a proper compilation? Or am I totally missing the point?
edit 1. tyndal, sorry, I wrote this as you were typing your post. I'll try it without the /thumb part and see if that helps.
btw, I am using the buildme.bat stuff and I included the (error-free) error file in the zip... which is why I was extra upset by the fact that, using the same set of files, you got an error and I did not.
As for the cygwin, I did mean the arm compiler, and if what you say is true (the resulting file creates a clean build WITHOUT modulus errors) then the problem HAS to be GNUDE, no?

Dr.Dre'del wrote:Ok, I know I'm not the sharpest knife in the drawer, but I feel like I should be understanding you and I'm not...
I have a directory (and a line in the makefile) pointing to -L$(GCCHOMEPATH)/lib/gcc-lib/arm-elf/3.3.1/thumb
Are you saying I should remove the /thumb part? Or are you saying that by virtue of not having a 2.95.3 directory I am lacking files essential to a proper compilation? Or am I totally missing the point?
If you do not have the standard directories, then it seems you are missing the non-Thumb (standard) support.
It plainly states in the Qualcomm-supplied makefile that
"Add $(TARG) to the CODE line if you're building a Thumb binary (at
# the moment, this doesn't work)."
Does
$(GCCHOMEPATH)/lib/gcc-lib/arm-elf/3.3.1/
exist?
If so, then I suspect that if you compile with that lib directory, your program will run fine.
If not, then you will have to consult the docs or whatever other resources came with your version to find that standard library directory.
I think you have your solution. Just sprint that last leg and don't forget to lean at the tape.

That was it!!! All the problems are gone and everything works!
Ok... what exactly does it mean to thumb compile?
Tyndal... I strongly urge you to add a warning to your excellent thread on the use and configuration of gnude to make sure that people don't make this mistake.
The comment in the makefile is this...
#-----------------------------------------------------------------------
# Library search path options. It points the location of libgcc.a
#-----------------------------------------------------------------------
Both the 3.3.1 and the 3.3.1/thumb directories have this file!
Anyway... thank you both SOOOOO much!
How weird, though, that the application (my actual app, not this little test file), which is fairly complex and thousands of lines spread across 10-some-odd files, works perfectly with the exception of this modulus (and divide) problem!
Again... what is this Thumb compilation that it would be so similar (but so evil)?

Dr.Dre'del wrote:
As for the cygwin, I did mean the arm compiler, and if what you say is true (the resulting file creates a clean build WITHOUT modulus errors) then the problem HAS to be GNUDE, no?
Basically, I would be surprised if removing the "thumb" part of the directory doesn't solve your problem..
edit:
I just noticed that the BrewElf2Mod tool was updated recently
Quote:
BREW(TM) Support for GNU Cross Compiler for Thumb mode and big endian support version 1.0.1.1
March 18, 2004
Of course, there doesn't seem to be a way to check the version of BrewElf2Mod, but I know I have the old version, and that is probably why I'm getting an 'Unknown type "R_ARM_THM_PC22"' error.
So, you probably want to notify BREW support that, at least in this case, Thumb mode doesn't work, or that you have to do something different for Thumb mode to work, or perhaps Thumb mode just doesn't work with the gnude distribution (maybe build your own gcc cross-compiler ;) )
-Tyndal

Dr.Dre'del wrote:That was it!!! All the problems are gone and everything works!
Ok... what exactly does it mean to thumb compile?
Good, glad it worked..
Basically, I think Thumb mode switches the CPU from the 32-bit ARM instruction set to a compressed 16-bit encoding:
http://www.nohau.com/appnotes/arm-thumb.pdf
-Tyndal
edit:
The problem is most likely due to code optimization.. allocating 10 bytes doesn't fit evenly into 16 or 32 bits, but is probably handled differently in Thumb and non-Thumb mode.. also, compiler optimization of a hard-coded number versus a variable makes a difference.
I'm not all that familiar with this stuff, just a guess as to what is going on ;)
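The "interworking" warnings from the linker earlier in the thread tie into this: ARM and Thumb code can only call each other when both sides are built for interworking. These example invocations are illustrative only (the flag spellings come from the GNU ARM toolchain; behavior with a particular gnude distribution may vary):

```shell
# Plain ARM object (what the BREW makefile produces by default)
arm-elf-gcc -c app.c -o app.o

# Mixing ARM and Thumb requires -mthumb-interwork on BOTH sides:
arm-elf-gcc -c -mthumb-interwork app.c -o app.o
arm-elf-gcc -c -mthumb -mthumb-interwork helper.c -o helper.o
```

Linking a plain ARM object against the Thumb-only libgcc.a (the /thumb lib directory) is what produced both the "does not support interworking" warnings and the R_ARM_THM_PC22 relocation that the older BREWelf2mod could not handle.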

I can't thank you enough... I would gladly send you a box of chocolates if I knew where to send them.
As for the 10 bytes, the malloc only affected the LG4500. The T-720 (for me) just crashed, and for you gave the wrong result, and would have continued to do so with or without the malloc. If not for the LG4500 crashing after the malloc I would have left it out altogether. I wonder what else (if anything) was being affected by this attempted Thumb compilation. I didn't notice any side effects... very odd.

Just out of curiosity, are the mod sizes different with & without the "/thumb" for your apps? I'm wondering if you really compiled for Thumb or not; you might have just been linking with Thumb. If you actually compiled for Thumb, it should be a smaller binary than non-Thumb. And the ARM compiler is better than the GCC one in either case.
ARM compiler (non-Thumb, I think): 2,036 bytes
my gnude build: 4,512 bytes
your original gnude build: 4,500 bytes
I think it would be a bigger difference if it was actually Thumb mode.
-Tyndal

4,528 bytes (for Thumb) vs. 4,512 bytes (for non-Thumb).
What does that tell you?

tyndal wrote:Just out of curiosity, are the mod sizes different with & without the "/thumb" for your apps? I'm wondering if you really compiled for Thumb or not; you might have just been linking with Thumb. If you actually compiled for Thumb, it should be a smaller binary than non-Thumb. And the ARM compiler is better than the GCC one in either case.
ARM compiler (non-Thumb, I think): 2,036 bytes
my gnude build: 4,512 bytes
your original gnude build: 4,500 bytes
I think it would be a bigger difference if it was actually Thumb mode.
-Tyndal
I suspect that is what he did.
Thumb is 16-bit mode.
http://66.102.7.104/search?q=cache:elhQAe-6LEIJ:www.nohau.com/appnotes/a...
Time to solve bug: 1 hour
Charge: 0 (this time) heh heh....
---jeff

FWIW,
GCC has always supported Thumb, at least since version 2.95. I was able to create THUMB code for the GBA using devkit advance, and you almost had to generate THUMB for the GBA due to hardware limitations.
The THUMB mode of the ARM CPU is a 16-bit mode. The instructions are 16 bits wide instead of 32 bits, half the registers are not as readily available for use while in 16-bit mode, and the instructions are a bit more limited than in ARM (32-bit instruction) mode. The 16-bit mode is certainly faster when there's a 16-bit memory system, though the CPU does have 32K of 32-bit-wide memory on-board that is very fast to access.
ARM (32-bit) and THUMB (16-bit) code can trivially co-exist in the same program. The CPU has instructions for calling from code in one CPU mode to code in the other and returning properly. If I remember correctly, GCC (or LD) has "interwork" command-line switches that you have to use if you want to mix ARM-compiled .o's with THUMB-compiled .o's. The "interwork" switches add the required code to switch between CPU modes appropriately.
I observed that the same programs compiled in THUMB are about 2/3 the size of the ARM-compiled versions. But since you may need more THUMB instructions to accomplish the same operation as fewer ARM instructions, you get slower code in THUMB mode.
If the 90-10 rule holds for your program, you can probably compile 90% of your subroutines in THUMB mode to save space and the other 10% in ARM mode to gain the needed speed.
The ARM compiler is much better than GCC, as the code it generates is smaller and up to 30% faster.
Interestingly, the ARM9 has THREE CPU modes: THUMB, ARM, and a third mode that executes Java byte codes.
Cheers

FWIW,
GCC has always supported thumb, at least since version 2.95. I was able to create THUMB code for the GBA using devkit advance. And you almost had to generate THUMB for the GBA due to hardware limitations.
The THUMB mode of the ARM CPU is a 16 bit mode. The instructions are 16 bits wide instead of 32 bits, half the registers are not as readily available for use while in 16-bit mode, and the instructions are a bit more limited than in ARM (32-bit instruction) mode. The 16-bit mode is certainly faster when there's a 16-bit memory system, though the CPU does have 32K of 32-bit wide memory on-board that is very fast to access.
ARM (32-bit) and THUMB (16-bit) code can trivially co-exist in the same program. The CPU has instructions for calling from code in one CPU mode to code in the other and returning properly. If I remember correctly, GCC (or LD) has "interwork" command line switches that you have to use if you want to mix ARM compiled .o's with THUMB compiled .o's. The "interwork" switches adds the required code to switch between CPU modes appropriately.
I observed that the same programs compiled in thumb are about 2/3 the size of the ARM compiled versions. But since you may need more THUMB instructions to accomplish the same operation as a fewer number of ARM instructions, you get slower code in THUMB mode.
If the 90-10 rule is true for your program, you can probably compile 90% of your subroutines in THUMB mode to save space and the other 10% in ARM mode to gain the needed speed.
The ARM compiler is much better than GCC, as the code it generates is smaller and up to 30% faster.
Interestingly, the ARM9 has THREE CPU modes. THUMB, ARM, and a third mode that executes JAVA byte codes.
Cheers

I'd imagine it didn't work because it'd be linking in the div routines from a non-interworked library; it should warn you, though.
