May your privileged stack always be big enough
James Harris  
Newsgroups: alt.os.development
From: James Harris <james.harri...@googlemail.com>
Date: Sat, 10 Oct 2009 04:52:16 -0700 (PDT)
Local: Sat 10 Oct 2009 12:52
Subject: May your privileged stack always be big enough
I normally list the key points of a post in the subject heading but in
this case there are just too many.... The post is about detecting
application stack overflow and underflow and, in particular,
protecting and sizing the privileged stack in 32-bit and 64-bit modes.

I'd appreciate your thoughts, suggestions and corrections.

I'm looking at the base Intel and AMD 64-bit architecture (which I'll
call x86-64 herein) with a view to it influencing my 32-bit code. Why?
Well, it seems sensible to design 32-bit operations which don't
require too many changes to port to 64-bit later. I've not looked at
64-bit working before. It is quite different, isn't it!

1) In x86-64 the stack segment has base = 0 and limit = none as do
code and data segments. So it's not even an option to detect stack
overflow (a request for stack expansion) or underflow (trying to
remove more than the stack holds) by reference to the stack segment.
The only option I can think of is to have guard page frames above and
below every application (non-privileged) stack. These would be marked
not-present. Is this the best way to detect application stack overflow
and underflow?

2) The privileged stack is a critical resource, isn't it? AFAICS it
must always have present memory to write to. If, in a page fault,
there is not enough stack space we'll get a double fault. And because
double faults are not restartable there is no apparent means of
recovery. So how is it best to provide privileged stack space? Should
its size be checked at the top or bottom of some or all service
routines, or can all service routines be written to unwind it before
returning to user mode? It seems so but it would be good to hear what
you guys have done or are thinking of.

3) If the privileged stack must always be large enough how much space
should be set aside? If it is only used to service interrupts and
syscalls it probably doesn't need to be very big. A 4k page seems much
too large. The bulk of the state can be saved in a thread image if
desirable.

James


Rod Pemberton  
Newsgroups: alt.os.development
From: "Rod Pemberton" <do_not_h...@nohavenot.cmm>
Date: Sun, 11 Oct 2009 04:43:26 -0400
Local: Sun 11 Oct 2009 09:43
Subject: Re: May your privileged stack always be big enough
"James Harris" <james.harri...@googlemail.com> wrote in message

news:72fd8b50-eaad-4764-8ff4-9ad64b530d5c@w19g2000yqk.googlegroups.com...

> I normally list the key points of a post in the subject heading but in
> this case there are just too many.... The post is about detecting
> application stack overflow and underflow and, in particular,
> protecting and sizing the privileged stack in 32-bit and 64-bit modes.

> I'd appreciate your thoughts, suggestions and corrections.

> I'm looking at the base Intel and AMD 64-bit architecture (which I'll
> call x86-64 herein) with a view to it influencing my 32-bit code. Why?
> Well, it seems sensible to design 32-bit operations which don't
> require too many changes to port to 64-bit later.

Sensible.  I was looking at interpreters to solve my "Just how do I get this
stuff to work on 64-bit if my compilers are only 32-bit?" problem.

> I've not looked at
> 64-bit working before.

I haven't looked.  Sorry, but BGB/cr88192 seems to be the only one
discussing x86-64 stuff lately...  And, it seems to me that posting levels
have fallen dramatically in the past few years.  Some of that is probably
due to major ISPs dropping free Usenet.

> It is quite different, isn't it!

I wouldn't know.  Ok, I know a bit now...

> 1) In x86-64 the stack segment has base = 0 and limit = none as do
> code and data segments.

Uh...  You'll have to explain that for me.  Delete. Delete. Ok, never mind.
I had to read a bit of the manual.  Yes, it seems that 64-bit mode ignores
the segment base on SS, DS, ES and ignores the limit on all.

> So it's not even an option to detect stack
> overflow (a request for stack expansion) or underflow (trying to
> remove more than the stack holds) by reference to the stack segment.

"The preferred method of implementing memory protection in a long-mode
operating system is to rely on the page-protection mechanism..." - Sect. 4.9
Vol 2 AMD64 Arch. Progr. Man. 2007

Answer?

> The only option I can think of is to have guard page frames above and
> below every application (non-privileged) stack. These would be marked
> not-present. Is this the best way to detect application stack overflow
> and underflow?

Don't know.   Ok, read some, it seems there is no limit checking in 64-bit
mode for SS.  Although, there is apparently "canonical" addressing or
sign-extension on the upper 16 bits that will generate a #SS if not all
zeros or all ones.  However, that allows a large 48-bit address, "up high"
or "down low" when sign-extended to 64-bits, but only if the address is
mapped into the page tables...  But, if it's mapped into the tables, then
you have access... Yes? No?  Sigh, do I now need to know how privilege and
rings work in 64-bit mode to answer that question?

> 2) The privileged stack is a critical resource, isn't it? AFAICS it
> must always have present memory to write to. If, in a page fault,
> there is not enough stack space we'll get a double fault. And because
> double faults are not restartable there is no apparent means of
> recovery. So how is it best to provide privileged stack space? Should
> its size be checked at the top or bottom of some or all service
> routines, or can all service routines be written to unwind it before
> returning to user mode? It seems so but it would be good to hear what
> you guys have done or are thinking of.

I actually haven't dealt with this issue at all.  My (stalled) OS is 32-bit.
It currently starts from DOS using a special TSR.  It inherits its stack
from the DPMI host...  This ensures it's not located where the application
is!  Of course, that will have to be fixed once it's loadable via a
bootloader.

> 3) If the privileged stack must always be large enough how much space
> should be set aside? If it is only used to service interrupts and
> syscalls it probably doesn't need to be very big. A 4k page seems much
> too large. The bulk of the state can be saved in a thread image if
> desirable.

No idea.  I've run into a few similar issues in my OS, and other C programs
for that matter...  I.e., "How much do I need to do this?..." and "Can I do
this safely without knowing how much is needed?..." etc.  And, I've not come
up with any good answer other than "tweak it 'til it works..." and "take the
safest path..."
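
One way to make "tweak it 'til it works" a little more empirical is to paint the stack with a known pattern before use and later measure how far the pattern has been overwritten. The following is only a sketch under assumptions: the names `stack_paint`, `stack_high_water` and the fill value `STACK_FILL` are hypothetical and do not come from anyone's OS in this thread, and the stack is assumed to grow downward from the top of the area.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define STACK_FILL 0xA5u /* arbitrary paint pattern (an assumption) */

/* Paint the whole stack area before it is first used. */
static void stack_paint(uint8_t *base, size_t size)
{
    assert(base != NULL);
    for (size_t i = 0; i < size; i++)
        base[i] = STACK_FILL;
}

/*
 * Return the largest number of bytes that appear to have been used.
 * The stack grows down from base + size, so the untouched part is the
 * run of paint bytes starting at the lowest address.
 */
static size_t stack_high_water(const uint8_t *base, size_t size)
{
    assert(base != NULL);
    size_t untouched = 0;
    while (untouched < size && base[untouched] == STACK_FILL)
        untouched++;
    return size - untouched;
}
```

The measurement can under-report if the deepest bytes written happen to equal the fill pattern, and it only reflects the paths actually exercised, so it is a guide for sizing rather than a guarantee.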

Rod Pemberton


Wolfgang Kern  
Newsgroups: alt.os.development
From: "Wolfgang Kern" <nowh...@never.at>
Date: Sun, 11 Oct 2009 12:28:42 +0200
Local: Sun 11 Oct 2009 11:28
Subject: Re: May your privileged stack always be big enough

James Harris wrote:
> I normally list the key points of a post in the subject heading but in
> this case there are just too many.... The post is about detecting
> application stack overflow and underflow and, in particular,
> protecting and sizing the privileged stack in 32-bit and 64-bit modes.
> I'd appreciate your thoughts, suggestions and corrections.
> I'm looking at the base Intel and AMD 64-bit architecture (which I'll
> call x86-64 herein) with a view to it influencing my 32-bit code. Why?
> Well, it seems sensible to design 32-bit operations which don't
> require too many changes to port to 64-bit later. I've not looked at
> 64-bit working before. It is quite different, isn't it!

Yes indeed, it's also physically a change to another CPU type.

> 1) In x86-64 the stack segment has base = 0 and limit = none as do
> code and data segments. So it's not even an option to detect stack
> overflow (a request for stack expansion) or underflow (trying to
> remove more than the stack holds) by reference to the stack segment.
> The only option I can think of is to have guard page frames above and
> below every application (non-privileged) stack. These would be marked
> not-present. Is this the best way to detect application stack overflow
> and underflow?

I think guard-pages may do the job, OTOH it should be the job
of the compiler/programmer to never let a stack-bug happen :)
Anyway, a system should detect and terminate such applications.
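
For readers who want to see guard pages in action without writing a kernel, the idea can be imitated in user space on a POSIX system. This is a sketch under assumptions, not what any poster's OS does: `alloc_guarded_stack` is a hypothetical helper, and `mmap`/`mprotect` with `PROT_NONE` stand in for a kernel marking page-table entries not-present.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE /* exposes MAP_ANONYMOUS on glibc */
#endif
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper: reserve usable_pages of stack with one
 * inaccessible guard page below and one above. Returns the lowest
 * usable address, or NULL on failure.
 */
static void *alloc_guarded_stack(size_t usable_pages, size_t *usable_bytes)
{
    long page = sysconf(_SC_PAGESIZE);
    assert(page > 0 && usable_pages > 0 && usable_bytes != NULL);
    size_t total = (usable_pages + 2) * (size_t)page;

    /* Reserve everything with no access, so both ends start as guards. */
    char *region = mmap(NULL, total, PROT_NONE,
                        MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (region == MAP_FAILED)
        return NULL;

    /* Open up only the pages in the middle. */
    if (mprotect(region + page, usable_pages * (size_t)page,
                 PROT_READ | PROT_WRITE) != 0) {
        munmap(region, total);
        return NULL;
    }
    *usable_bytes = usable_pages * (size_t)page;
    return region + page;
}
```

Any access one byte below the returned address, or at the returned address plus `*usable_bytes`, then faults instead of silently corrupting a neighbour.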

I once played around with a stack-warning (still implemented
in my debugger) given by a coarse check during the 1 ms PIT IRQ,
but this wouldn't help much on running buggy applications.

> 2) The privileged stack is a critical resource, isn't it? AFAICS it
> must always have present memory to write to. If, in a page fault,
> there is not enough stack space we'll get a double fault. And because
> double faults are not restartable there is no apparent means of
> recovery. So how is it best to provide privileged stack space? Should
> its size be checked at the top or bottom of some or all service
> routines, or can all service routines be written to unwind it before
> returning to user mode? It seems so but it would be good to hear what
> you guys have done or are thinking of.

I assume you mean the systems stack here, and of course it must
be large enough for all system internal calls and IRQ-HW-handling
(while I have IRQ-user-event handlers apart in user space).

Where to put it? I have it on top of the resident systems area,
this is a part which never will be swapped.

> 3) If the privileged stack must always be large enough how much space
> should be set aside? If it is only used to service interrupts and
> syscalls it probably doesn't need to be very big. A 4k page seems much
> too large. The bulk of the state can be saved in a thread image if
> desirable.

The required stack-space depends ...
If your OS isn't as bloated as the two big ones, then 4 KB
may be quite enough for 32-bit mode, and page alignment can be a
safety help.

Even though I stand by my 'one stack per CPU is enough', I prepared
(for 32-bit mode) 2 KB for four fixed system stack parts:

*512 IRQs + internal calls
 +128 never used but reserved for the above
*128 exceptions       ;needed only in case of a stack-fault
*128 debugger         ;just in case I debug live system code
*128 intermode linker ;while RM user stack resides in 1.MB or HMA

For 64-bit modes, which I just started to design, it may need
more stack-space just due to larger element size.
__
wolfgang


Marven Lee  
Newsgroups: alt.os.development
From: "Marven Lee" <mar...@invalid.invalid>
Date: Tue, 13 Oct 2009 21:41:13 +0100
Local: Tues 13 Oct 2009 21:41
Subject: Re: May your privileged stack always be big enough

James Harris wrote...
> 3) If the privileged stack must always be large enough how much space
> should be set aside? If it is only used to service interrupts and
> syscalls it probably doesn't need to be very big. A 4k page seems much
> too large. The bulk of the state can be saved in a thread image if
> desirable.

I think I use 4k (or maybe 8k) stacks per task in the kernel; these stacks
are used during syscalls and for kernel tasks.  I also have a single,
separate stack for handling hardware interrupts.

I allow nested hardware interrupts and switch to the interrupt-
stack in the assembly outer wrapper of the interrupt handler.
No switch to the interrupt stack occurs if it is already on it.
A counter keeps track of the nest level, once it returns to zero
it switches back to the task's normal kernel stack.
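
The nest-counter scheme described above can be modelled in plain C. This is a simulation under assumptions, not Marv's actual code: the real switch happens in an assembly wrapper, and `struct cpu_state`, `irq_enter` and `irq_exit` are hypothetical names used only to show the bookkeeping.

```c
#include <assert.h>
#include <stdint.h>

struct cpu_state {
    uintptr_t sp;            /* simulated current stack pointer */
    uintptr_t saved_task_sp; /* task kernel SP saved at outermost entry */
    uintptr_t irq_stack_top; /* top of the dedicated interrupt stack */
    unsigned  nest;          /* current interrupt nesting level */
};

/* Called at the start of every hardware interrupt. */
static void irq_enter(struct cpu_state *c)
{
    /* Only the outermost interrupt switches stacks. */
    if (c->nest++ == 0) {
        c->saved_task_sp = c->sp;
        c->sp = c->irq_stack_top;
    }
}

/* Called at the end of every hardware interrupt. */
static void irq_exit(struct cpu_state *c)
{
    assert(c->nest > 0);
    /* Back at level zero: return to the task's kernel stack. */
    if (--c->nest == 0)
        c->sp = c->saved_task_sp;
}
```

With one such structure per CPU, the same logic would extend to the per-core interrupt stacks mentioned for multiprocessing.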

I think I set the interrupt stack to be about 16k; I doubt that nested
interrupt handlers would use anywhere near that.  If I get round to
multiprocessing there would be an interrupt stack per cpu/core.

--
Marv


James Harris  
Newsgroups: alt.os.development
From: James Harris <james.harri...@googlemail.com>
Date: Thu, 15 Oct 2009 08:08:45 -0700 (PDT)
Local: Thurs 15 Oct 2009 16:08
Subject: Re: May your privileged stack always be big enough
On 11 Oct, 09:43, "Rod Pemberton" <do_not_h...@nohavenot.cmm> wrote:

Do you mean how do you run 32-bit compilers under a 64-bit OS? I would
have thought they should still work as ordinary 32-bit apps unless
they do something very odd. The bigger issue is dealing with their
system calls.

Yes, and CS. Only FS and GS are allowed to roam free. Notice that the
stack segment cannot be expand-down. Hence my plan for not-present
guard page frames to bound each stack.

> > So it's not even an option to detect stack
> > overflow (a request for stack expansion) or underflow (trying to
> > remove more than the stack holds) by reference to the stack segment.

> "The preferred method of implementing memory protection in a long-mode
> operating system is to rely on the page-protection mechanism..." - Sect. 4.9
> Vol 2 AMD64 Arch. Progr. Man. 2007

> Answer?

Sort of. It seems page protection is the *only* way to implement
memory protection.

As an aside, perhaps this is a case of where competition has not been
good for the consumer. Intel's design of the 80386 was brilliant. So
good, in fact, that it has stood the test of time. Speeds have
increased and some new instructions have been added but we still use
the same user-facing architecture (with binary compatibility) almost
25 years later. And while there are 64-bit options now there's no sign
that the 80386 32-bit architecture is dying out.

By contrast, the 64-bit architecture defined by AMD may have been
first to market with x86-32 compatibility but, although the registers
are wider and there are more of them (a good thing but not rocket
science), it doesn't seem to do anything special.

For example, rather than fixing the segments model (one of the few
things Intel's 32-bit model didn't do well) or scaling it down they
did away with it (almost) altogether. In particular, in terms of
memory protection, forcing the overlap of code and data seems a 'bad
thing.'

> > The only option I can think of is to have guard page frames above and
> > below every application (non-privileged) stack. These would be marked
> > not-present. Is this the best way to detect application stack overflow
> > and underflow?

> Don't know.   Ok, read some, it seems there is no limit checking in 64-bit
> mode for SS.  Although, there is apparently "canonical" addressing or
> sign-extension on the upper 16 bits that will generate a #SS if not all
> zeros or all ones.  However, that allows a large 48-bit address, "up high"
> or "down low" when sign-extended to 64-bits, but only if the address is
> mapped into the page tables...  But, if it's mapped into the tables, then
> you have access... Yes? No?  Sigh, do I now need to know how privilege and
> rings work in 64-bit mode to answer that question?

Yes, canonical addressing is good. It ensures that unused upper bits
are not places the programmer can squirrel-away extra data - and then
find the code doesn't work when implementations use more bits for
addressing.
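
The canonical-address rule is easy to state in code. The sketch below assumes the 48-bit virtual address width mentioned in this thread but takes the width as a parameter, since later implementations may use more bits; `is_canonical` is a hypothetical name.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * An address is canonical when bits 63 down to (va_bits - 1) are all
 * equal, i.e. the upper bits are a sign extension of the top
 * implemented bit.
 */
static bool is_canonical(uint64_t addr, unsigned va_bits)
{
    assert(va_bits >= 2 && va_bits <= 64);
    if (va_bits == 64)
        return true;
    uint64_t upper    = addr >> (va_bits - 1);
    uint64_t all_ones = UINT64_MAX >> (va_bits - 1);
    return upper == 0 || upper == all_ones;
}
```

For 48 bits this accepts 0x0000000000000000 through 0x00007FFFFFFFFFFF and 0xFFFF800000000000 through 0xFFFFFFFFFFFFFFFF, and rejects everything in between.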

I think privilege rings work the same way in 64-bit mode. IIRC paging
must be enabled before changing to 64-bit mode.

James


James Harris  
Newsgroups: alt.os.development
From: James Harris <james.harri...@googlemail.com>
Date: Thu, 15 Oct 2009 08:34:19 -0700 (PDT)
Local: Thurs 15 Oct 2009 16:34
Subject: Re: May your privileged stack always be big enough
On 11 Oct, 11:28, "Wolfgang Kern" <nowh...@never.at> wrote:

> James Harris wrote:

...

> > I've not looked at
> > 64-bit working before. It is quite different, isn't it!

> Yes indeed, it's also physically a change to another CPU type.

It's an odd mix. I'm not sure why they didn't go for different
instruction encodings since binary code will not be compatible between
32-bit and 64-bit modes. I doubt much of the decode unit has to be
shared. A more efficient encoding would have removed the need for a
REX prefix, for example, to access the upper eight registers.

> > 1) In x86-64 the stack segment has base = 0 and limit = none as do
> > code and data segments. So it's not even an option to detect stack
> > overflow (a request for stack expansion) or underflow (trying to
> > remove more than the stack holds) by reference to the stack segment.
> > The only option I can think of is to have guard page frames above and
> > below every application (non-privileged) stack. These would be marked
> > not-present. Is this the best way to detect application stack overflow
> > and underflow?

> I think guard-pages may do the job, OTOH it should be the job
> of the compiler/programmer to never let a stack-bug happen :)
> anyway a system should detect and terminate such applications.

Apart from runaway stack use the OS may want to allocate pages to the
stack as they are used. For example it might allocate space for a
stack of ten pages but only commit one of them. The page fault handler
would then distinguish between an extra page needed in the permitted
range and an attempt to allocate over the permitted range.
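
The decision the page fault handler has to make can be sketched as a pure function. Everything here is an assumption for illustration: the `struct stack_region` layout, the enum names and `classify_fault` are hypothetical, and one guard page is assumed on each side of the reserved range.

```c
#include <assert.h>
#include <stdint.h>

enum stack_fault {
    FAULT_OTHER,     /* not related to this stack */
    FAULT_GROW,      /* inside the reserved range: commit another page */
    FAULT_OVERFLOW,  /* hit the guard page below the reserved range */
    FAULT_UNDERFLOW  /* hit the guard page above the stack top */
};

struct stack_region {
    uintptr_t limit;     /* lowest address the stack may ever grow to */
    uintptr_t committed; /* lowest address currently backed by memory */
    uintptr_t top;       /* one past the highest stack address */
};

static enum stack_fault classify_fault(const struct stack_region *s,
                                       uintptr_t addr, uintptr_t page_size)
{
    assert(s->limit >= page_size);
    assert(s->limit <= s->committed && s->committed <= s->top);

    if (addr >= s->limit && addr < s->committed)
        return FAULT_GROW;
    if (addr >= s->limit - page_size && addr < s->limit)
        return FAULT_OVERFLOW;
    if (addr >= s->top && addr < s->top + page_size)
        return FAULT_UNDERFLOW;
    return FAULT_OTHER;
}
```

For the ten-page example, `limit` would sit ten pages below `top` and `committed` one page below `top` until the first growth fault is serviced.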

That points out a weakness of the loss of the stack segment in AMD64.
Say a routine allocates a large stack frame of 10k. (Unusual but
certainly possible.) After decrementing rsp by 10k even though there
is an unmapped guard page (at around the new rsp + 9k) the routine
might start tramping over memory below that guard page (at rsp + 0 and
above) before it tries to access the guard page at something like rsp
+ 8k. Hence it's writing over memory it shouldn't touch.
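
A common mitigation for this, which compilers can apply to large frames, is stack probing: before using a big frame, touch the stack at intervals of no more than one page, from the old stack pointer downward, so that the guard page is hit first. The sketch below only computes which addresses would be probed; `probe_addresses` is a hypothetical helper and the guard is assumed to be at least one page in size.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * List the addresses to touch, highest first, when allocating a frame
 * of frame_size bytes below old_sp. No two consecutive touches (or
 * old_sp and the first touch) are more than page_size apart, so a
 * guard page of at least page_size bytes cannot be stepped over.
 * Returns the number of probes needed; writes at most max of them.
 */
static size_t probe_addresses(uintptr_t old_sp, size_t frame_size,
                              size_t page_size, uintptr_t *out, size_t max)
{
    assert(page_size > 0 && frame_size <= old_sp);
    uintptr_t new_sp = old_sp - frame_size;
    uintptr_t p = old_sp;
    size_t n = 0;

    while (p - new_sp > page_size) {
        p -= page_size;
        if (n < max)
            out[n] = p;
        n++;
    }
    if (frame_size > 0) {
        if (n < max)
            out[n] = new_sp; /* final probe at the new stack pointer */
        n++;
    }
    return n;
}
```

For the 10k frame in the example with 4k pages, this yields three probes, and the first one to land in the guard page faults before any memory below the guard has been written.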

True. An option is, rather than saving state on the stack, to save it
in a thread-local scratchpad or, if only one interrupt can be in
service at a time, even a cpu-local scratchpad block. I don't think
the stack proper needs to hold much other than return addresses and a
few pointers.

James


James Harris  
Newsgroups: alt.os.development
From: James Harris <james.harri...@googlemail.com>
Date: Thu, 15 Oct 2009 08:40:22 -0700 (PDT)
Local: Thurs 15 Oct 2009 16:40
Subject: Re: May your privileged stack always be big enough
On 13 Oct, 21:41, "Marven Lee" <mar...@invalid.invalid> wrote:

Thanks for the info.

Just out of curiosity why not use just one PL=0 stack for when in
privileged mode? Something to do with task switching?

James


Rod Pemberton  
Newsgroups: alt.os.development
From: "Rod Pemberton" <do_not_h...@nohavenot.cmm>
Date: Fri, 16 Oct 2009 05:29:45 -0400
Local: Fri 16 Oct 2009 10:29
Subject: Re: May your privileged stack always be big enough
"James Harris" <james.harri...@googlemail.com> wrote in message

news:943fb89a-9433-41a7-a3f7-69ae44bb991a@j4g2000yqa.googlegroups.com...

> On 11 Oct, 09:43, "Rod Pemberton" <do_not_h...@nohavenot.cmm> wrote:
> > "James Harris" <james.harri...@googlemail.com> wrote in message
> news:72fd8b50-eaad-4764-8ff4-9ad64b530d5c@w19g2000yqk.googlegroups.com...

> > > I'm looking at the base Intel and AMD 64-bit architecture (which I'll
> > > call x86-64 herein) with a view to it influencing my 32-bit code. Why?
> > > Well, it seems sensible to design 32-bit operations which don't
> > > require too many changes to port to 64-bit later.

> > I was looking at interpreters to solve my "Just how do I get this
> > stuff to work on 64-bit if my compilers are only 32-bit?" problem.

> Do you mean how do you run 32-bit compilers under a 64-bit OS?

Sorry, I should've been a bit clearer.  If the C compilers I currently use
are only 32-bit, how do I compile my C code for 64-bit?  (Can't...)  E.g.,
"Just how do I get this stuff to work..."  So, either I get new compilers or
I find another method of "code portability", like an interpreter.

> > > So it's not even an option to detect stack
> > > overflow (a request for stack expansion) or underflow (trying to
> > > remove more than the stack holds) by reference to the stack segment.

> > "The preferred method of implementing memory protection in a long-mode
> > operating system is to rely on the page-protection mechanism..." - Sect.
> > 4.9 Vol 2 AMD64 Arch. Progr. Man. 2007

> > Answer?

> Sort of. It seems page protection is the *only* way to implement
> memory protection.

... for 64-bit segments.

> As an aside, perhaps this is a case of where competition has not been
> good [...]

If I drop the "consumer" part of that phrase, I'd definitely agree.  I
personally would've preferred that they kept the execution model very close
to what already existed.  Changing the execution model, while perhaps a good
choice for the future, creates problems for programmers.

> [AA-64's design is a situation ...]
> where competition has not been
> good for the consumer.

Not sure if it matters at all to the generic or even adept consumer.  It
does matter to assembly OS programmers.  It shouldn't matter to HLL
programmers or to C OS programmers who don't see much assembly.

> Intel's design of the 80386 was brilliant.

I have the same issue here (or with the '286 actually) as with AA-64, I
personally would've preferred that they kept the execution model very close
to what already existed, i.e., a 24-bit RM, then 32-bit RM, instead of
adoption of a new PM model.

> Intel's design of the 80386 [...]
> stood the test of time.

Wasn't it based on established *NIX design?

> In particular, in terms of
> memory protection, forcing the overlap of code and data seems a 'bad
> thing.'

NX or XD bits?  While I'm not familiar with their use either, my
understanding was these were implemented (supposedly at MS's request) to
handle this issue.

> > [...] there is apparently "canonical" addressing or
> > sign-extension on the upper 16 bits that will generate a #SS if not all
> > zeros or all ones. However, that allows a large 48-bit address, "up
> > high" or "down low" when sign-extended to 64-bits, but only if
> > the address is mapped into the page tables...

> Yes, canonical addressing is good. It ensures that unused upper bits
> are not places the programmer can squirrel-away extra data - and then
> find the code doesn't work when implementations use more bits for
> addressing.

That was a problem with, um..., 24-bit to 32-bit addressing on the Motorola
68k series, IIRC.  Was it from 24-bit '286 to 32-bit '386 also?

> IIRC paging
> must be enabled before changing to 64-bit mode.

That's an important difference although probably not for you.  But, if I
were to "design 32-bit operations which don't require too many changes to
port to 64-bit later" for my OS, I'd have to enable paging for 32-bits.

Rod Pemberton


James Harris  
Newsgroups: alt.os.development
From: James Harris <james.harri...@googlemail.com>
Date: Fri, 16 Oct 2009 03:58:42 -0700 (PDT)
Local: Fri 16 Oct 2009 11:58
Subject: Re: May your privileged stack always be big enough
On 16 Oct, 10:29, "Rod Pemberton" <do_not_h...@nohavenot.cmm> wrote:

You know you can run 32-bit apps "unchanged" on x86-64 in
compatibility mode. If referring to non-apps - i.e. system code - for
the most part it seems to me to need a rewrite. Code for a 32-bit
kernel and code for a 64-bit kernel, at this point, seem to me to have
too many differences to use much of the same code, though they can
work with similar structures and concepts.

> > > > So it's not even an option to detect stack
> > > > overflow (a request for stack expansion) or underflow (trying to
> > > > remove more than the stack holds) by reference to the stack segment.

> > > "The preferred method of implementing memory protection in a long-mode
> > > operating system is to rely on the page-protection mechanism..." - Sect.
> > > 4.9 Vol 2 AMD64 Arch. Progr. Man. 2007

> > > Answer?

> > Sort of. It seems page protection is the *only* way to implement
> > memory protection.

> ... for 64-bit segments.

Yes

> > As an aside, perhaps this is a case of where competition has not been
> > good [...]

> If I drop the "consumer" part of that phrase, I'd definitely agree.  I
> personally would've preferred that they kept the execution model very close
> to what already existed.  Changing the execution model, while perhaps a good
> choice for the future, creates problems for programmers.

> > [AA-64's design is a situation ...]
> > where competition has not been
> > good for the consumer.

By consumers I mean programmers generally. The average Windows user
won't see the differences.

> Not sure if it matters at all to the generic or even adept consumer.  It
> does matter to assembly OS programmers.  It shouldn't matter to HLL
> programmers or to C OS programmers who don't see much assembly.

You think even a C OS programmer won't see much difference? You may be
right. Maybe I've been focussing on the differences too much!

> > Intel's design of the 80386 was brilliant.

> I have the same issue here (or with the '286 actually) as with AA-64, I
> personally would've preferred that they kept the execution model very close
> to what already existed, i.e., a 24-bit RM, then 32-bit RM, instead of
> adoption of a new PM model.

> > Intel's design of the 80386 [...]
> > stood the test of time.

> Wasn't it based on established *NIX design?

I don't know. It was derived from the 286 for sure.

An odd thought: the 8086 had a genuinely bizarre use of segments. The
real mystery is where that came from. And, no, I haven't looked it up.
If I did I'd probably find out a good reason.... Anyway the segment
registers actually made sense when they switched to protected mode. I
know they weren't fast to load and at 16 bits were awkward to pass
around and other things but they at least made sense in protected
mode. It's as if Intel knew they would need them in the future so they
added them with a minuscule x16 offset years earlier. But I digress.

> > In particular, in terms of
> > memory protection, forcing the overlap of code and data seems a 'bad
> > thing.'

> NX or XD bits?  While I'm not familiar with their use either, my
> understanding was these were implemented (supposedly at MS's request) to
> handle this issue.

A good case in point. These had to be retro-fitted due to the code
segment being abandoned (or at least munged with the data segments)
and then people starting to execute data as code. This security hole
would never have arisen if OSes had kept data and code separate in the
first place. The no-execute bit in the page table is a fix for a
problem that didn't need to exist.

Of course, now it's touted as a big selling point: this processor
supports execute disable. One bit in a paging structure (which
wouldn't be needed if the OSes used the hardware protections already
provided) has become a celebrity.

Speaking of which I've been looking to see what processors support NX
or XD but all I can find is it depends on what CPUID says. Anyone know
of a list of processors which provide this support? It affects what I
need to write for those that don't.
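
On the CPUID point: the NX/XD capability is reported in bit 20 of EDX for extended leaf 0x80000001. The following user-space sketch assumes a GCC or Clang toolchain that provides `<cpuid.h>` and its `__get_cpuid` helper; a kernel would normally issue CPUID itself. The names `edx_has_nx` and `cpu_has_nx` are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
#include <cpuid.h>
#define HAVE_CPUID_H 1
#endif

#define CPUID_EXT_FEATURES 0x80000001u /* extended feature leaf */
#define EDX_NX_BIT         (1u << 20)  /* NX / XD capability flag */

/* Decode the NX flag from an EDX value returned by the leaf above. */
static bool edx_has_nx(uint32_t edx)
{
    return (edx & EDX_NX_BIT) != 0;
}

/* Query the running CPU; reports false where CPUID is unavailable. */
static bool cpu_has_nx(void)
{
#ifdef HAVE_CPUID_H
    unsigned int eax, ebx, ecx, edx;
    /* __get_cpuid returns 0 if the requested leaf is not supported. */
    if (__get_cpuid(CPUID_EXT_FEATURES, &eax, &ebx, &ecx, &edx))
        return edx_has_nx(edx);
#endif
    return false;
}
```

Testing at run time like this avoids depending on a list of processor models, though a list would still help in deciding how much fallback code is worth writing.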

> > > [...] there is apparently "canonical" addressing or
> > > sign-extension on the upper 16 bits that will generate a #SS if not all
> > > zeros or all ones. However, that allows a large 48-bit address, "up
> > > high" or "down low" when sign-extended to 64-bits, but only if
> > > the address is mapped into the page tables...

> > Yes, canonical addressing is good. It ensures that unused upper bits
> > are not places the programmer can squirrel-away extra data - and then
> > find the code doesn't work when implementations use more bits for
> > addressing.

> That was a problem with, um..., 24-bit to 32-bit addressing on the Motorola
> 68k series, IIRC.  Was it from 24-bit '286 to 32-bit '386 also?

I don't know. Thankfully I never had to plan for the 286.

> > IIRC paging
> > must be enabled before changing to 64-bit mode.

> That's an important difference although probably not for you.  But, if I
> were to "design 32-bit operations which don't require too many changes to
> port to 64-bit later" for my OS, I'd have to enable paging for 32-bits.

True. Paging is part of my plans anyway but IIRC not yours.

James


Rod Pemberton  
Newsgroups: alt.os.development
From: "Rod Pemberton" <do_not_h...@nohavenot.cmm>
Date: Fri, 16 Oct 2009 09:20:22 -0400
Local: Fri 16 Oct 2009 14:20
Subject: Re: May your privileged stack always be big enough
"James Harris" <james.harri...@googlemail.com> wrote in message

news:12637d43-e2fc-4da6-ba4f-4b1d4a48191f@m38g2000yqd.googlegroups.com...

> On 16 Oct, 10:29, "Rod Pemberton" <do_not_h...@nohavenot.cmm> wrote:
> > "James Harris" <james.harri...@googlemail.com> wrote in message

> > > [AA-64's design is a situation ...]
> > > where competition has not been
> > > good for the consumer.

> > Not sure if it matters at all to the generic or even adept consumer. It
> > does matter to assembly OS programmers. It shouldn't matter to HLL
> > programmers or to C OS programmers who don't see much assembly.

> You think even a C OS programmer won't see much difference? You may be
> right. Maybe I've been focussing on the differences too much!

Well, that's based on my current recollections of  my OS development
experiences...  I've done everything in C that I could do without using
assembly.  My OS is by no means devoid of assembly.  There is a bit.  The
assembly code is various 32-bit privileged instructions, interrupt wrappers,
code for my startup method, code for the cpu mode setup, misc. assembly
adjustments which I might've done the hard way, and code for a bunch of no
longer necessary design choices, which are in assembly inlined in C.  That
stuff would need a 64-bit rewrite.  The C code that deals with fixed size
fields of CPU data structures, e.g., filling in descriptors or interrupt
vectors, might need adjustments.  But, most of the C code, the non-special
areas, should be functionally the same when compiled for 64-bits.  But, I
don't have 64-bit compilers...

> This security hole
> would never have arisen if OSes had kept data and code separate in the
> first place.

So, you would dump the von Neumann architecture for a Harvard architecture?

I think the standardization of micros on 8-bit bytes for ASCII and von
Neumann really helped languages like C and FORTH.  I know that C is more
difficult to implement if memory sizes aren't 8-bit byte based, e.g., 16-bit
word sized, or if integers and pointers are not equally sized, or if
different pointer types exist, etc.  I'd assume that the primary reason to
use Harvard would be to support different instruction and data sizes.  e.g.,
small RISC instruction set with large integer size.  I've never heard of
Harvard used to provide security, although I see no reason why it couldn't
be.  The question is: "Is data always tied to an instruction or are all
instructions data free?"  Typically, there is instruction data - data that
is part of an instruction like offsets - and non-instruction data such as
storing a register value in a memory location.  How does Harvard keep the
two separate, or how does it link the instruction data to an instruction?...
Separating the two introduces potential complexities.

Personally, if I'm coding in C, I don't care about the issue of code and
data separation.  The compiler takes care of it for me.  Of course, I'm
coding for 32-bit C where some of these issues are resolved without
segmentation present.  I'm not coding in C for 8086 with all its different
memory models...  But, if I'm coding in assembly, I'm not usually interested
in keeping the code and non-instruction data separate.  It complicates
development by moving data outside the accessible offset ranges of
instructions and outside the locality of the code using the variable.

> One bit in a paging structure (which
> wouldn't be needed if the OSes used the hardware protections already
> provided) has become a celebrity.

OS developers apparently learned recently what assembly programmers knew
twenty years ago.  That a flat non-segmented address space is optimal.
(somewhat seriously, somewhat sarcastically, somewhat humorously...)

IMO, using XD and NX to prevent buffer overflow attacks in C stackframes is
clearly the wrong solution to the problem.  The correct solution is two
stacks for C.  One for control flow information and the other for data.
Then, data cannot overwrite control flow information.
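
A toy sketch of the two-stack idea in portable C, not how any real compiler lays things out. Return addresses go on one bounded stack and locals on another, so running off the end of a local buffer is refused instead of reaching control-flow data. All names and sizes are illustrative, and a real implementation would put the two regions in separate mappings.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define CTRL_DEPTH 64   /* illustrative sizes */
#define DATA_BYTES 256

struct two_stacks {
    uintptr_t ctrl[CTRL_DEPTH];   /* return addresses only */
    size_t    ctrl_top;
    uint8_t   data[DATA_BYTES];   /* locals and buffers only */
    size_t    data_top;
};

/* Push a return address; fails rather than overflowing. */
static bool push_return(struct two_stacks *s, uintptr_t ret)
{
    if (s->ctrl_top == CTRL_DEPTH)
        return false;
    s->ctrl[s->ctrl_top++] = ret;
    return true;
}

/* Pop a return address; fails on underflow. */
static bool pop_return(struct two_stacks *s, uintptr_t *ret)
{
    if (s->ctrl_top == 0)
        return false;
    *ret = s->ctrl[--s->ctrl_top];
    return true;
}

/* Reserve n bytes on the data stack; NULL if it would overflow. */
static uint8_t *alloc_local(struct two_stacks *s, size_t n)
{
    if (n > DATA_BYTES - s->data_top)
        return NULL;
    uint8_t *p = &s->data[s->data_top];
    s->data_top += n;
    return p;
}
```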

> Speaking of which I've been looking to see what processors support NX
> or XD but all I can find is it depends on what CPUID says. Anyone know
> of a list of processors which provide this support? It affects what I
> need to write for those that don't.

... just what Wikipedia says in "Hardware Background" and "Microsoft
Windows" sections:
http://en.wikipedia.org/wiki/XD_bit

Crud... requires PAE enabled too.

> Paging is part of my plans anyway but IIRC not yours.

It has definite advantages.  Reorganizing the address space any way you see
fit is beneficial.  But, I don't need it yet.  My OS is just not developed
enough.  And, paging introduces the potential of page faults which, in my
mind, is a reliability issue I'll have to figure out how to fix.  So, until
I'm more familiar with paging and can look into minimizing, preferably
eliminating, page faults, I won't be doing much with paging anytime soon.

Rod Pemberton


James Harris  
Newsgroups: alt.os.development
From: James Harris <james.harri...@googlemail.com>
Date: Fri, 16 Oct 2009 10:19:20 -0700 (PDT)
Local: Fri 16 Oct 2009 18:19
Subject: Re: May your privileged stack always be big enough
On 16 Oct, 14:20, "Rod Pemberton" <do_not_h...@nohavenot.cmm> wrote:
...

> > This security hole
> > would never have arisen if OSes had kept data and code separate in the
> > first place.

> So, you would dump the von Neumann architecture for a Harvard architecture?

My thoughts are more along the lines of using information available. I
would fundamentally distinguish between

1. Machine code
2. Read-only data
3. Read-write data

To me these are different parts of an executable's image and imply
different treatments. Such distinctions can lead to gains in security
and efficiency. To wit, the NX bit and also the separate instruction
and data caches on CPUs. These caches provide a Harvard-like division
but there's not necessarily a need to go the whole hog and use
different buses etc.

> I think the standardization of micros on 8-bit bytes for ASCII and von
> Neumann really helped languages like C and FORTH.  I know that C is more
> difficult to implement if memory sizes aren't 8-bit byte based, e.g., 16-bit
> word sized, or if integers and pointers are not equally sized, or if
> different pointer types exist, etc.  I'd assume that the primary reason to
> use Harvard would be to support different instruction and data sizes.  e.g.,
> small RISC instruction set with large integer size.  I've never heard of
> Harvard used to provide security, although I see no reason why it couldn't
> be.  The question is: "Is data always tied to an instruction or are all
> instructions data free?"  Typically, there is instruction data - data that
> is part of an instruction like offsets - and non-instruction data such as
> storing a register value in a memory location.  How does Harvard keep the
> two separate, or how does it link the instruction data to an instruction?...

I don't know. I've never really had a reason to look at Harvard
architectures but have no problem with an instruction having an
immediate operand. Otherwise, accessing code as an offset from a base
code address and data as an offset from a base data address seem fine
to me.

> Separating the two introduces potential complexities.

> Personally, if I'm coding in C, I don't care about the issue of code and
> data separation.  The compiler takes care of it for me.  Of course, I'm
> coding for 32-bit C where some of these issues are resolved without
> segmentation present.  I'm not coding in C for 8086 with all its different
> memory models...  But, if I'm coding in assembly, I'm not usually interested
> in keeping the code and non-instruction data separate.  It complicates
> development by moving data outside the accessible offset ranges of
> instructions and outside the locality of the code using the variable.

The different memory models: tiny, big, huge etc or whatever they were
called were always a bad idea, IMHO. Why? Well, as programmers we want
to express algorithms and solve computational problems. Much of the
mechanism used for these models seems to me to be a different level of
abstraction.

As for your comment about accessing data within reasonable offsets of
instructions, well, it's not a model I care much for. Given that, you
can guess that AMD found another way to annoy me with the rip-relative
addressing they added in x86-64. It helps to enshrine the old
fashioned load image model where the data sits at certain offsets from
the code. I think we should be leaving that old model behind, not
encouraging it.

Sorry if this sounds like a diatribe. It's not meant to be: more a
(very brief) explanation. In fairness we can always program round
these things and it's that which has been occupying my mind for the
past while as I've been looking at AMD's x86-64.

One positive thing I found in their design is the swapgs instruction.
It allows a called routine to *quickly* find its working data - but
only if that routine is the PL 0 kernel and only if it has been called
from PL 3.

> > One bit in a paging structure (which
> > wouldn't be needed if the OSes used the hardware protections already
> > provided) has become a celebrity.

> OS developers apparently learned recently what assembly programmers knew
> twenty years ago.  That a flat non-segmented address space is optimal.
> (somewhat seriously, somewhat sarcastically, somewhat humorously...)

A flat model is good in some ways but it has limitations. Say we want
to expand the size of a region of memory. Think of realloc in C. If
the memory into which the region would expand is occupied we have to
reallocate elsewhere (if memory or address space is available), copy
data, repoint and then remove the old mapping. And even then, any
pointers into the old region will be incorrect. A two-dimensional view
of memory would be better here and make this stuff much easier and
faster.

I think there may be ways to ameliorate these problems somewhat ...
but there's no gain without pain elsewhere.
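
One possible way to soften the stale-pointer problem on a flat model is to keep (region, offset) pairs and turn them into pointers only at the moment of use. The sketch below assumes that approach. The struct and function names are invented for illustration, and the extra pain is a bounds check and an add on every access.

```c
#include <stddef.h>
#include <stdlib.h>

/* Growable buffer that callers index by offset rather than by pointer. */
struct region {
    unsigned char *base;
    size_t size;
};

/* Grow the region; returns 0 on success, -1 if realloc fails.
 * The base address may change, but offsets into the region stay valid. */
static int region_grow(struct region *r, size_t new_size)
{
    unsigned char *p = realloc(r->base, new_size);
    if (p == NULL)
        return -1;
    r->base = p;
    r->size = new_size;
    return 0;
}

/* Translate an offset to a pointer only at the moment of use. */
static unsigned char *region_at(const struct region *r, size_t off)
{
    return off < r->size ? r->base + off : NULL;
}
```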

> IMO, using XD and NX to prevent buffer overflow attacks in C stackframes is
> clearly the wrong solution to the problem.  The correct solution is two
> stacks for C.  One for control flow information and the other for data.
> Then, data cannot overwrite control flow information.

An interesting idea. Of course, in C any pointer that's, er,
mispointed can overwrite control flow or any other info but that is
because of current models. Developing your suggestion, the return
address stack could be protected against update by instructions and
the data stack could perhaps also be protected against accesses which
are not off the stack pointer or frame pointer.

Of course, setting up stack frames is not mandated by C is it? You
could always store arguments and parameters elsewhere in memory and
pass a pointer to them. Then the one and only stack would just hold
return addresses. (It would maybe also hold the parameter block
pointer too if you didn't want to pass it in a register.)

> > Speaking of which I've been looking to see what processors support NX
> > or XD but all I can find is it depends on what CPUID says. Anyone know
> > of a list of processors which provide this support? It affects what I
> > need to write for those that don't.

> ... just what Wikipedia says in "Hardware Background" and "Microsoft
> Windows" sections:http://en.wikipedia.org/wiki/XD_bit

> Crud... requires PAE enabled too.

Ah, I thought something else was needed but wasn't sure what. Looks
like paging is the way to go, Rod. Go on, add it in. You know you want
to really. :-)

> > Paging is part of my plans anyway but IIRC not yours.

> It has definite advantages.  Reorganizing the address space any way you see
> fit is beneficial.  But, I don't need it yet.  My OS is just not developed
> enough.  And, paging introduces the potential of page faults which, in my
> mind, is a reliability issue I'll have to figure out how to fix.  So, until
> I'm more familiar with paging and can look into minimizing, preferably
> eliminating, page faults, I won't be doing much with paging anytime soon.

You may find it difficult to add paging with your current boot method.
When you get back to OS dev you may want to look at a more
conventional boot method. Then you have full control of a virgin
machine with no cruft from someone else's operating system.

  http://foldoc.org/cruft

BTW, although page faults are called faults it doesn't imply
faultiness or lack of reliability. :-) I know you didn't mean that but
don't forget that you can start by identity mapping all memory - or at
least all of it you intend to use. Then linear addresses will be equal
to physical addresses. Mark all pages present, writable and with the
appropriate privilege level and you won't have any page faults either.
Once that's working you can delay really making use of paging until
you are ready.
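
A sketch of the identity-mapping step, assuming classic 32-bit paging with 4 KiB pages and no PAE (the PAE entry format is different). It only computes the table contents; loading CR3 and setting CR0.PG are left out. Function and macro names are illustrative.

```c
#include <stdint.h>

#define PAGE_SIZE     4096u
#define ENTRIES       1024u        /* entries per 32-bit page table */
#define PTE_PRESENT   0x001u
#define PTE_WRITABLE  0x002u
#define PTE_USER      0x004u       /* leave clear for supervisor-only pages */

/* Fill one page table so that the 4 MiB it covers maps linear == physical.
 * table_index is the page-directory slot this table will occupy. */
static void identity_fill_table(uint32_t table[ENTRIES], uint32_t table_index,
                                uint32_t flags)
{
    for (uint32_t i = 0; i < ENTRIES; i++) {
        uint32_t phys = (table_index * ENTRIES + i) * PAGE_SIZE;
        table[i] = phys | flags;
    }
}

/* Build the page-directory entry that points at a table located at the
 * 4 KiB-aligned physical address table_phys. */
static uint32_t make_pde(uint32_t table_phys, uint32_t flags)
{
    return (table_phys & ~(PAGE_SIZE - 1)) | flags;
}
```

With every entry marked present and writable, no access inside the mapped range can raise a page fault.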

James


Rod Pemberton  
Newsgroups: alt.os.development
From: "Rod Pemberton" <do_not_h...@nohavenot.cmm>
Date: Sat, 17 Oct 2009 05:06:38 -0400
Local: Sat 17 Oct 2009 10:06
Subject: Re: May your privileged stack always be big enough
"James Harris" <james.harri...@googlemail.com> wrote in message

news:52df9cae-91a0-4865-9c58-eb29facabb18@b2g2000yqi.googlegroups.com...

> Given that, you
> can guess that AMD found another way to annoy me with the rip-relative
> addressing they added in x86-64.

Doesn't x86-64 have two types of addressing?...

> One positive thing I found in their design is the swapgs instruction.

Hmm, I entered "swapgs" into Yahoo to find out what it does.  The only
things that come up are security vulnerabilities...

> It allows a called routine to *quickly* find its working data - but
> only if that routine is the PL 0 kernel and only if it has been called
> from PL 3.

...

> Developing your suggestion the return
> addresses stack could be protected against update by instructions and
> the data stack could perhaps also be protected against accesses which
> are not off the stack pointer or frame pointer.

Of course, without this being done in hardware by the memory manager, one
could bypass it using assembly.

> Of course, setting up stack frames is not mandated by C is it?

No.  But, C requires recursion.  I don't know all the info on this, but
apparently computer scientists (CS) proved in the 1960's or 1950's that a
stack using stackframes was the easiest way to implement recursion.

> You
> could always store arguments and parameters elsewhere in memory and
> pass a pointer to them. Then the one and only stack would just hold
> return addresses. (It would maybe also hold the parameter block
> pointer too if you didn't want to pass it in a register.)

A stack in memory...

> Looks like paging is the way to go, Rod.

I'm not sure it's "the way".  But, it's apparently the only choice
remaining...

I remember GUI OSes (Amiga, Mac) working well on Motorola 68000 cpu's
without hardware MMU's.  I'm not sure if there were MMU features implemented
in software or not.  I think that would've been unlikely given the
processing power, or lack of, they had at the time.  The 68k series added an
external MMU with the later 68020 processor.

> When you get back to OS dev you may want to look at a more
> conventional boot method

I've got two projects I'm more interested in at the moment.

> Then you have full control of a virgin
> machine with no cruft from someone else's operating system.

There's not much cruft from starting from DOS.  There's far more cruft from
my choice, originally, to use DOS C compiler libraries and executables...
I'll need to remove that stuff from my OS.  If I can ever get my C-ish
compilers working, then I'll have control of a compiler and won't have to
deal with pre-existing conditions and bugs of other compilers.  Attempts at
minimalism are in the works here...

The cruft from starting from DOS is basically just the same stuff I'd have
to do from a bootloader.  Ignoring my special DOS TSR startup method, I'm in
16-bit RM when I start - just like from a bootloader.  And, I start compiled
32-bit C code from 16-bit RM.  One needs to setup the standard 16-bit RM to
32-bit PM switch as well as an instruction pointer, stack pointer, and old
stack pointer.

I do the basic cpu startup in 16/32-bit assembly, but I also rerun cpu setup
using inlined 32-bit C.  That sounds redundant, but MultiBoot, e.g. GRUB,
passes control in raw 32-bit mode...  The assembly is minimal 32-bit setup,
while the C code is more thorough, IIRC.  If I do my own bootloader, I can
move the assembly there.  The OpenWatcom C code is low or no cruft.  The
DJGPP C code uses and accesses a few things in its C libraries and C
startup that I have to recreate.  When I get around to eliminating all C
library code, some of that will be eliminated.  The bootloader should
eliminate the rest.  Usually, people write their own kernel library to make
sure the C functions are safe.  I chose to use existing C libraries to speed
up development.  But, I also made sure I'm only using OS safe functions.  I
just realized while writing this that I'm not really sure why I did that.  I
don't use the C libraries much, if at all, for most of my other programs.
cliche: "Hindsight..."

> BTW, although page faults are called faults it doesn't imply
> faultiness or lack of reliability.

Well, I recall seeing a chart in one of the manuals and didn't like some of
the situations which generated them.

> [...] you can start by identity mapping all memory - or at
> least all of it you intend to use. Then linear addresses will be equal
> to physical addresses. Mark all pages present, writable and with the
> appropriate privilege level and you won't have any page faults either.
> Once that's working you can delay really making use of paging until
> you are ready.

Good to know!

Rod Pemberton


James Harris  
Newsgroups: alt.os.development
From: James Harris <james.harri...@googlemail.com>
Date: Mon, 26 Oct 2009 07:58:53 -0700 (PDT)
Local: Mon 26 Oct 2009 14:58
Subject: Re: May your privileged stack always be big enough
On 17 Oct, 09:06, "Rod Pemberton" <do_not_h...@nohavenot.cmm> wrote:

> "James Harris" <james.harri...@googlemail.com> wrote in message

> news:52df9cae-91a0-4865-9c58-eb29facabb18@b2g2000yqi.googlegroups.com...

> > Given that, you
> > can guess that AMD found another way to annoy me with the rip-relative
> > addressing they added in x86-64.

> Doesn't x86-64 have two types of addressing?...

Two types? Rip-relative is additional to existing addressing allowing
data areas to be addressed relative to code - yuck. :-(

> > One positive thing I found in their design is the swapgs instruction.

> Hmm, I entered "swapgs" into Yahoo to find out what it does.  The only
> things that come up are security vulnerabilities...

Well, any program has to be written properly.

..

> > Of course, setting up stack frames is not mandated by C is it?

> No.  But, C requires recursion.  I don't know all the info on this, but
> apparently computer scientists (CS) proved in the 1960's or 1950's that a
> stack using stackframes was the easiest way to implement recursion.

Thanks for the reminder. I mustn't lose sight of this. However, many
of the functions of an OS do not have to be recursive. For example,
the address space manager, the scheduler, interrupt service routines,
device drivers etc. These don't need stack frames when faster
mechanisms are available. ISTM that OS design allows us not so much to
break the rules but to define the rules. Normal constraints don't
always apply. We can design our own call interface or interfaces if
there's good reason to do so, such as for performance.

> > When you get back to OS dev you may want to look at a more
> > conventional boot method

> I've got two projects I'm more interested in at the moment.

I know.

> > Then you have full control of a virgin
> > machine with no cruft from someone else's operating system.

> There's not much cruft from starting from DOS.

OK.

...

> > BTW, although page faults are called faults it doesn't imply
> > faultiness or lack of reliability.

> Well, I recall seeing a chart in one of the manuals and didn't like some of
> the situations which generated them.

Sure, like GPFs. They may not be nice in that they have many potential
causes but we can control what we allow to trigger them. Page faults
are easier to manage than GPFs - or the dreaded double faults.

...

James


Marven Lee  
Newsgroups: alt.os.development
From: "Marven Lee" <mar...@invalid.invalid>
Date: Wed, 28 Oct 2009 13:16:06 -0000
Local: Wed 28 Oct 2009 13:16
Subject: Re: May your privileged stack always be big enough

>James Harris wrote...
>>Marven Lee wrote...
>> James Harris wrote...
>> > 3) If the privileged stack must always be large enough how much space
>> > should be set aside? If it is only used to service interrupts and
>> > syscalls it probably doesn't need to be very big. A 4k page seems much
>> > too large. The bulk of the state can be saved in a thread image if
>> > desirable.

>> I think I use 4k (or maybe 8k) stacks per task in the kernel, these
>> stacks
>> are used during syscalls and for kernel tasks. I also have a single,
>> separate stack for handling hardware interrupts.

[...]

>Just out of curiosity why not use just one PL=0 stack for when in
>privileged mode? Something to do with task switching?

I believe QNX is one of the few operating systems to use a single
kernel stack per processor.   This is called an interrupt-model
kernel.  Most other operating systems follow the process-model
kernel with a kernel stack for each task.

With a single kernel stack it is not possible for more than one
task to enter the kernel,  so multitasking -within- the kernel is
not easily possible.   It is not possible to block or sleep
within the kernel (and remain in the kernel).   Well I suppose you
could block on a spinlock waiting for a task on another processor.

Years ago on my first OS project I tried to write a microkernel
that used an IPC mechanism similar to QNX's MsgSend,
MsgReceive, MsgReply and MsgRead/MsgWrite functions.
I ended up using an interrupt-model kernel with a single
kernel stack.  I got the IPC working but the rest of my code
became a mess, so I started on something different.

So in QNX (and in my OS) only basic synchronization primitives,
timer handling and message copying were implemented in the
kernel.

Synchronization primitives could be done simply,  without
interrupts being enabled for the duration of the syscall.  Register
state would be saved in the process structure instead of on the
stack upon entry to the kernel.  The main part of the syscall
would then be done without blocking,  the current process
and other processes could be added or removed from the
run-queue though.

A syscall would typically look like this...

SyscallWrapper:
    Save registers onto current_process
    DoSyscall()    / bulk of syscall
                   / may call SchedReady(proc) or
                   / SchedUnready(proc) to add or remove
                   / processes from run queue.  No blocking
                   / occurs during execution of DoSyscall.
    current_process = PickProc()
    Load registers from current_process
    Return from interrupt
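
The outline above can be simulated in plain C roughly as follows. This is a sketch under assumptions, not the original code: the register layout, the two syscall numbers and the priority rule in PickProc are invented, and cpu stands in for the live register file at kernel entry.

```c
#include <stddef.h>
#include <stdint.h>

#define NPROC 4

struct regs { uint32_t eax, ebx, ecx, edx, esp, eip; };

struct proc {
    struct regs saved;   /* register image lives here, not on a kernel stack */
    int ready;           /* 1 if on the run queue */
    int priority;        /* larger value runs first */
};

static struct proc procs[NPROC];
static struct proc *current_process = &procs[0];

static void SchedReady(struct proc *p)   { p->ready = 1; }
static void SchedUnready(struct proc *p) { p->ready = 0; }

/* Highest-priority ready process; NULL if none is ready. */
static struct proc *PickProc(void)
{
    struct proc *best = NULL;
    for (size_t i = 0; i < NPROC; i++)
        if (procs[i].ready && (!best || procs[i].priority > best->priority))
            best = &procs[i];
    return best;
}

/* Hypothetical syscall numbers for the sketch. */
enum { SYS_WAKE = 1, SYS_BLOCK_SELF = 2 };

/* Runs to completion without blocking; only edits the run queue. */
static void DoSyscall(struct regs *r)
{
    if (r->eax == SYS_WAKE && r->ebx < NPROC)
        SchedReady(&procs[r->ebx]);
    else if (r->eax == SYS_BLOCK_SELF)
        SchedUnready(current_process);
}

/* Sketch only: assumes at least one process is ready afterwards;
 * a real kernel would switch to an idle task otherwise. */
static void SyscallWrapper(struct regs *cpu)
{
    current_process->saved = *cpu;        /* save registers onto current_process */
    DoSyscall(&current_process->saved);   /* bulk of syscall */
    struct proc *next = PickProc();
    if (next != NULL)
        current_process = next;
    *cpu = current_process->saved;        /* load registers, then "iret" */
}
```

Because the register image is stored in the process structure, the single kernel stack is empty again by the time the wrapper returns.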

It would be possible to enable interrupts during the execution of
DoSyscall(),  but the code that adds and removes processes
from the run queue needs some form of locking.

For functions that are short in duration,  interrupts could
be disabled for the entirety without causing missed interrupts.

For longer functions,  such as copying messages during message
passing, interrupts would be enabled.  The interrupt may wake
up another process, or a quantum may expire, either of which could
require a reschedule of the current process.   So the message copying
code needs to check every so often that it is still the highest-priority
runnable task,  else return from interrupt to the new task.
This requires some modification of the above syscall wrapper
code to handle these cases and also to resume message passing
when the task doing the message passing is run again.
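
A rough sketch of such a resumable copy, assuming the progress counter is kept in the task's own state and that an interrupt handler sets a flag when a reschedule is wanted. The names msg_copy, need_resched and msg_copy_run are hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Progress of one message transfer, kept in the sending task's state so the
 * copy can be resumed after a preemption. */
struct msg_copy {
    const unsigned char *src;
    unsigned char *dst;
    size_t len;
    size_t done;
};

enum copy_status { COPY_DONE, COPY_PREEMPTED };

/* Set by an interrupt handler when a higher-priority task becomes ready
 * or the quantum expires (hypothetical flag for this sketch). */
static volatile int need_resched;

/* Copy in bounded chunks, polling need_resched between chunks. */
static enum copy_status msg_copy_run(struct msg_copy *c, size_t chunk)
{
    if (chunk == 0)
        chunk = 1;                   /* guard against a zero-length step */
    while (c->done < c->len) {
        size_t n = c->len - c->done;
        if (n > chunk)
            n = chunk;
        memcpy(c->dst + c->done, c->src + c->done, n);
        c->done += n;
        if (need_resched && c->done < c->len)
            return COPY_PREEMPTED;   /* caller returns from interrupt to the new task */
    }
    return COPY_DONE;
}
```

When the task is next scheduled, calling msg_copy_run again with the same structure picks up at the saved offset.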

I can't remember much more about my old code other than it
got a bit messy.  My current OS uses the kernel stack per
process plus the separate interrupt stack.   I haven't worked
on it for some time though for lots of reasons.  I seem to drift
in and out of osdev.

--
Marv


Rod Pemberton  
Newsgroups: alt.os.development
From: "Rod Pemberton" <do_not_h...@nohavenot.cmm>
Date: Mon, 2 Nov 2009 04:34:33 -0500
Local: Mon 2 Nov 2009 09:34
Subject: Re: May your privileged stack always be big enough
"James Harris" <james.harri...@googlemail.com> wrote in message

news:70639651-c310-4929-ac33-35915173a830@j9g2000vbp.googlegroups.com...

> On 17 Oct, 09:06, "Rod Pemberton" <do_not_h...@nohavenot.cmm> wrote:
> > "James Harris" <james.harri...@googlemail.com> wrote in message
> > news:52df9cae-91a0-4865-9c58-eb29facabb18@b2g2000yqi.googlegroups.com...

> > > Given that, you
> > > can guess that AMD found another way to annoy me with the rip-relative
> > > addressing they added in x86-64.

> > Doesn't x86-64 have two types of addressing?...

> Two types? Rip-relative is additional to existing addressing allowing
> data areas to be addressed relative to code - yuck. :-(

That should allow easy implementation of position independent code, yes?
Does that enhance the usefulness of paging?

> However, many
> of the functions of an OS do not have to be recursive.

Recursive functions were "nifty" when first learning how to program...
Otherwise, I've found they are a) substitutes for loops b) a way of avoiding
structured programming and c) use the stack and/or stackframes for
uncontrolled memory allocation.  So, I tend to avoid recursion.

> For example,
> the address space manager, the scheduler, interrupt service routines,
> device drivers etc. These don't need stack frames when faster
> mechanisms are available.

If you're using a C compiler, you might - depending on which compiler - be
able to eliminate the stack frame by declaring the function as "void
function(void)", or __declspec(naked), or perhaps with an __attribute__ for
GCC - although *I* haven't determined which attribute...  Let me know if you
find one for GCC.

> ISTM that OS design allows us not so much to
> break the rules but to define the rules.

What rules?  :)  My code.  My rul... Wait.  What rules?

Rod Pemberton


James Harris  
Newsgroups: alt.os.development
From: James Harris <james.harri...@googlemail.com>
Date: Mon, 2 Nov 2009 07:39:33 -0800 (PST)
Local: Mon 2 Nov 2009 15:39
Subject: Re: May your privileged stack always be big enough
On 2 Nov, 09:34, "Rod Pemberton" <do_not_h...@nohavenot.cmm> wrote:

...

> > > > Given that, you
> > > > can guess that AMD found another way to annoy me with the rip-relative
> > > > addressing they added in x86-64.

> > > Doesn't x86-64 have two types of addressing?...

> > Two types? Rip-relative is additional to existing addressing allowing
> > data areas to be addressed relative to code - yuck. :-(

> That should allow easy implementation of position independent code, yes?

Does it? Existing jump instructions - both conditional and
unconditional - jump relative to the program counter so code-to-code
references can already be position independent.

AIUI RIP-relative addressing allows *data* addressing relative to the
instruction pointer. Is that a step forward...? I can't see it. For
example, for performance reasons it makes sense to allow multiple
instances of a program to use the same code. Now, how do we give each
instance its own separate data address? RIP-relative addressing
encourages sticking them all together in the old [code, data, bss,
heap, stack] sequence.
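
For contrast, a small C sketch of the base-pointer alternative, in which one copy of the code serves several instances and each instance's data is reached through a pointer handed to the code. The struct and function are invented for illustration.

```c
#include <stdint.h>

/* All mutable state for one running instance of the program. */
struct instance_data {
    uint32_t counter;
    uint32_t limit;
};

/* One copy of the code serves every instance: data is reached as an
 * offset from the base pointer passed in, never relative to the code. */
static int tick(struct instance_data *self)
{
    if (self->counter >= self->limit)
        return 0;
    self->counter++;
    return 1;
}
```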

> Does that enhance the usefulness of paging?

I don't know what you mean. What do you have in mind?

...

> > For example,
> > the address space manager, the scheduler, interrupt service routines,
> > device drivers etc. These don't need stack frames when faster
> > mechanisms are available.

> If you're using a C compiler, you might - depending on which compiler - be
> able to eliminate the stack frame by declaring the function as "void
> function(void)", or __declspec(naked), or perhaps with an __attribute__ for
> GCC - although *I* haven't determined which attribute...  Let me know if you
> find one for GCC.

My plans for memory layout are generally formulating round assembler
rather than C. This is deliberate. I want to define the best execution
model possible. (For "best" read: most efficient, simplest, most
flexible.) I don't want to be influenced by the cruft imposed by a
compiler.

That said, I will look at supporting other models later. I just don't
want to mandate their use. If I build the models around one or more
preexisting compilers I will most likely end up with the models the
compilers use.

> > ISTM that OS design allows us not so much to
> > break the rules but to define the rules.

> What rules?  :)  My code.  My rul... Wait.  What rules?

The rules as to how programs operate, how the OS switches between
them, how the OS responds to different protection requirements. That
sort of thing.

James


Rod Pemberton  
Newsgroups: alt.os.development
From: "Rod Pemberton" <do_not_h...@nohavenot.cmm>
Date: Mon, 2 Nov 2009 16:31:54 -0500
Local: Mon 2 Nov 2009 21:31
Subject: Re: May your privileged stack always be big enough
"James Harris" <james.harri...@googlemail.com> wrote in message

news:ad476704-5683-4f51-ae19-99fef9b0926d@n35g2000yqm.googlegroups.com...

> On 2 Nov, 09:34, "Rod Pemberton" <do_not_h...@nohavenot.cmm> wrote:
> > [...]

> AIUI RIP-relative addressing allows *data* addressing relative to the
> instruction pointer.

If one uses mixed code and data, then if I move the position of the block of
code and data, then using RIP addressing for data, nothing changes for
offsets to either code or data... if I understood what you're saying since I
have yet to read up on RIP.  But, isn't the ability to relocate code and
data what is called "position independent code"?  You can move the code and
data anywhere, and no offsets need to be recomputed?

According to the NASM documentation, NASM defines five special symbols to generate
PIC for ELF in assembly.  While I've not used them, I'd guess the
RIP-relative addressing would eliminate most of them, if not all.  The key
one seems to be "wrt", i.e., "with respect to", which as I trivially
understand it, allows one to compute data offsets relative to other fixed
locations.  These offsets seem to be computed relative to a "got", i.e.,
"global offset table".

> Is that a step forward...? I can't see it.

Not sure.  It seems to add support for mixed code and data or position
independent code, which can improve locality of data in a cache... more
speed?  Compared to 6502 assembly, my recollection (an aging one) is that
x86 doesn't implement some of the powerful relative addressing modes the
6502 had.

> For
> example, for performance reasons it makes sense to allow multiple
> instances of a program to use the same code.

Ok.

> Now, how do we give each
> instance its own separate data address? RIP-relative addressing
> encourages sticking them all together in the old [code, data, bss,
> heap, stack] sequence.

Well, don't use RIP then... (?)  It's not forced is it?

> > Does that enhance the usefulness of paging?

> I don't know what you mean. What do you have in mind?

Well, if RIP-relative addressing for data allows mixed code and data to be
more position independent, then doesn't that mean that paging becomes more
effective? or easier?  You can just put the block of code and data wherever
as long as it's page sized and page aligned.  I.e., no need to recompute
data offsets, etc... (?)

> My plans for memory layout are generally formulating round assembler
> rather than C.

If there is a conflict, how do you plan to implement C? Or, some other HLL?

> This is deliberate. I want to define the best execution
> model possible. (For "best" read: most efficient, simplest, most
> flexible.) I don't want to be influenced by the cruft imposed by a
> compiler.

Ah.  Well, I'm not sure there is that much "cruft" with modern C compilers.
About the only truly hidden thing the x86 C compilers I've used do to the
emitted assembly - other than optimization - is construct stackframes around
procedures.  If you're using a stack in assembly, you might be creating
stackframes already.

Rod Pemberton


James Harris  
Newsgroups: alt.os.development
From: James Harris <james.harri...@googlemail.com>
Date: Mon, 2 Nov 2009 16:38:23 -0800 (PST)
Local: Tues 3 Nov 2009 00:38
Subject: Re: May your privileged stack always be big enough
On 2 Nov, 21:31, "Rod Pemberton" <do_not_h...@nohavenot.cmm> wrote:

You introduced the term into the discussion so I guess it's your call
as to what it means here. Perhaps in general it could be taken to mean
code that does not need to have fixups applied when it is loaded. If
the code can be loaded without fixups it's position-independent. If
the entire file can be loaded without relocation perhaps that needs a
different term - like position-independent image - but that's a term
I've just made up.

The traditional single-image layout where the text, data and bss
sections are at the bottom and the stack is at the top with heap in
between doesn't seem to allow much relocation. Instead it seems to me
to mandate running each image in its own address space. This makes for
slower switching between processes.

> From the NASM documentation NASM defines five special symbols to generate
> PIC for ELF in assembly.  While I've not used them, I'd guess the
> RIP-relative addressing would eliminate most of them, if not all.  The key
> one seems to be "wrt", i.e., "with respect to", which as I trivially
> understand it, allows one to compute data offsets relative to other fixed
> locations.  These offsets seem to be computed relative to a "got", i.e.,
> "global offset table".

I took a look at the nasm doc on this but it looks complicated and I
haven't studied it.

> > Is that a step forward...? I can't see it.

> Not sure.  It seems to add support for mixed code and data or position
> independent code, which can improve locality of data in a cache... more
> speed?  Compared to 6502 assembly - actually my aging recollections of it -
> are that x86 doesn't implement some of the powerful relative addressing
> modes it had.

Interesting. I too wrote some early machine code for the 6502. IIRC it
had addressing relative to X and Y registers and needed zero-page
space. (Ring any bells?) It was great at the time but only *if* there
were zero-page spaces available. If these were in short supply the
indexing was very constrained. Even if there were spaces available it
was primitive. You sure you aren't looking through a romantic haze of
nostalgia? :-)

> > For
> > example, for performance reasons it makes sense to allow multiple
> > instances of a program to use the same code.

> Ok.

> > Now, how do we give each
> > instance its own separate data address? RIP-relative addressing
> > encourages sticking them all together in the old [code, data, bss,
> > heap, stack] sequence.

> Well, don't use RIP then... (?)  It's not forced is it?

No. I have been careful to say its addition *encourages* lumping data
along with code. It doesn't mandate it. That said, in protected mode
we had an effective DS register that is no longer there. Instead, for
OS code at least we are encouraged to use GS to locate local data. The
abolition of one segment register and the promotion of another seems
inconsistent.

One option for running multiple instances of a piece of code is to
reserve a page for offsets or pointers and then change just that page
when switching between tasks. The page would be at a fixed location.
It's a slight workaround but as long as invlpg invalidates just one
page (which it should do but it's not guaranteed) it ought to be fast.
It's perhaps the best option I have (for lightweight task switching
between homogeneous tasks). The switched page would always be at the
same location but each version would point to its local locations in
the address space.
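
To make the fixed-page idea concrete, here is a small user-space model in C. It is only a sketch under stated assumptions: the names (`instance_page`, `fixed_slot`, `switch_to`, `bump`) are invented for illustration, and the global pointer `fixed_slot` merely stands in for the page-table entry a real kernel would rewrite (followed by an invlpg for that one address). The point it shows is that the shared code never names an instance's data directly; it always goes through the slot.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical layout of the per-instance page of pointers. */
struct instance_page {
    int *counter;        /* this instance's private data */
    const char *name;    /* label for the instance */
};

/* Stands in for the mapping at the fixed virtual address. In a kernel
   this would be a page-table entry, not a C pointer. */
static const struct instance_page *fixed_slot = NULL;

/* Model of the lightweight task switch: change only the one mapping. */
static void switch_to(const struct instance_page *page)
{
    fixed_slot = page;
}

/* Shared code: identical for every instance, reaches data via the slot. */
static int bump(void)
{
    assert(fixed_slot != NULL);
    return ++*fixed_slot->counter;
}
```

Two instances would each own an `instance_page` pointing at their own counter; calling `switch_to` before `bump` decides whose counter moves, and nothing in `bump` itself changes.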

Another option is to use GS to point to each instance's local data.

Yet another one is to use different models in 32-bit and 64-bit modes.
The massive address space in 64-bit mode allows a different approach
to achieve similar results in memory management.

> > > Does that enhance the usefulness of paging?

> > I don't know what you mean. What do you have in mind?

> Well, if RIP-relative addressing for data allows mixed code and data to be
> more position independent, then doesn't that mean that paging becomes more
> effective? or easier?  You can just put the block of code and data wherever
> as long as it's page sized and page aligned.  I.e., no need to recompute
> data offsets, etc... (?)

I don't know. You can always align code and data in 32-bit mode. I can
see the advantage of loading a position-independent code image - as
long as only one instance of it is needed. If more instances are
needed it looks like they need their own address spaces - which are
slower to switch between. Of course, people tend to use threads with
their associated pros and cons.

> > My plans for memory layout are generally formulating round assembler
> > rather than C.

> If there is a conflict, how do you plan to implement C? Or, some other HLL?

Multiple models can be supported, I think. Having a simpler
lightweight model doesn't preclude supporting a more traditional model
in a different address space. At least that's the plan. In theory the
compiler could write an object file appropriate to any supported
model.

The point is to prevent the OS design being controlled by existing
compilers - to prevent the tail from wagging the dog if you like.

> > This is deliberate. I want to define the best execution
> > model possible. (For "best" read: most efficient, simplest, most
> > flexible.) I don't want to be influenced by the cruft imposed by a
> > compiler.

> Ah.  Well, I'm not sure there is that much "cruft" with modern C compilers.
> About the only truly hidden thing the x86 C compilers I've used do to the
> emitted assembly - other than optimization - is construct stackframes around
> procedures.  If you're using a stack in assembly, you might be creating
> stackframes already.

That's just it you see - I'm not forcing the use of stack frames for
every call. One model I'm playing with is modules which have
persistent activation records. These cannot sit on the stack.
Functions which are not recursive don't need a stack frame either.
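
As a rough illustration of what a persistent activation record could look like, here is a C sketch. The module (`checksum_add`) and its record layout are hypothetical and are not taken from any actual design in this thread; the idea shown is only that the record lives in static storage, so state set up in one activation is still there for the next, and that this works precisely because the function is never re-entered.

```c
#include <assert.h>

/* Hypothetical activation record for a checksum module. It lives in
   static storage, so it survives between activations. */
struct checksum_record {
    unsigned long total;   /* running state kept across calls */
    unsigned calls;        /* number of completed activations */
    int active;            /* set while an activation is in progress */
};

static struct checksum_record checksum_rec;  /* zero-initialised */

static unsigned long checksum_add(const unsigned char *buf, unsigned len)
{
    struct checksum_record *r = &checksum_rec;
    assert(!r->active);    /* one record, so no recursion or re-entry */
    r->active = 1;
    for (unsigned i = 0; i < len; i++)   /* i can live in a register */
        r->total += buf[i];
    r->calls++;
    r->active = 0;
    return r->total;
}
```

A C compiler will still emit its usual call sequence around this, so the sketch shows the data model rather than the calling scheme being discussed.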

A C compiler would come with this cruft. (I stand by the term.) And
even though C is low level by design it still doesn't allow the call
scheme to be controlled. The average compiler implements calls and
parameter passing the way it implements them and that's it.

James


Rod Pemberton  
Newsgroups: alt.os.development
From: "Rod Pemberton" <do_not_h...@nohavenot.cmm>
Date: Tue, 3 Nov 2009 04:43:39 -0500
Local: Tues 3 Nov 2009 09:43
Subject: Re: May your privileged stack always be big enough
"James Harris" <james.harri...@googlemail.com> wrote in message

news:6761173d-4f2b-43e2-807f-abd40d8c9c9d@37g2000yqm.googlegroups.com...

> The traditional single-image layout where the text, data and bss
> sections are at the bottom and the stack is at the top with heap in
> between doesn't seem to allow much relocation. Instead it seems to me
> to mandate running each image in its own address space. This makes for
> slower switching between processes.

Oooo... I must've misunderstood something.

I basically consider a "single-image layout" to be the binary data that
comprises the executable application, i.e., a file.  The text (code) and
data comprise the "image".  The bss is allocated after the "image" in memory
by the executable loader/startup and cleared too.  The stack and heap are
wherever the OS places them, AIUI.  They can be one of each or many of each
or whatever..., AIUI.

> Interesting. I too wrote some early machine code for the 6502. IIRC it
> had addressing relative to X and Y registers and needed zero-page
> space. (Ring any bells?)

Yup, faint ones way off in the distance... 6510 actually.  Unless one coded
on the C64, people ask: "What's a 6510?"...  It's a 6502 with a port.

Appendix L of the C64's Programmer's Reference manual is most of (or all
of...) the MOS Tech datasheet for the 6510.  I'm thinking of page 416 and
417.  It shows the 6510 instruction set and thirteen (13) addressing modes.
Appendix L is available as "Chapter 7 - C64 Programmers Reference -
Appendices" in almost 10MB .pdf form at the link:

http://www.commodore.ca/manuals/c64_programmers_reference/c64-program...
(Right-click "save-as" the .pdf, don't try to load it directly from their
slow site.)

Commodore.ca
http://www.commodore.ca

> It was great at the time but only *if* there
> were zero-page spaces available.

IIRC, compared to other microprocessors at the time, it was what I'd call a
memory based and/or "load-store" design - few registers, fast zero page
memory instructions, accumulator/memory based programming model, etc.  IIRC,
most of the magazines at the time called it a load-store microprocessor too...
IIRC, others, like the Z80 - which I never programmed - used registers far
more heavily.  Unfortunately, the term "load-store" has been warped over the
ages.  Wikipedia's usage doesn't match with my recollections.

> You sure you aren't looking through a romantic haze of
> nostalgia? :-)

Probably, but with 13 addressing modes, who knows...  ;-)

> That said, in protected mode
> we had an effective DS register that is no longer there. Instead, for
> OS code at least we are encouraged to use GS to locate local data.

A good reason to at least consider nanokernels...

> The
> abolition of ... and the promotion of ... seems
> inconsistent.

That general problem: inconsistency, between 16-bit, 32-bit, and 64-bit x86,
is another reason I'm considering interpreters *a lot* lately.  I don't like
their slowness, but if 128-bit comes out in a year or two, am I to rewrite
everything again?  If 64-bit AA isn't "whack" enough, what if they do
something really, really radical for 256-bits, e.g., to conserve memory?
Who wants 256-bit offsets and pointers?  Code bloat?  Justification for
RIP-relative addressing?  What happens if the ARM microprocessor eventually
replaces the x86 in low-end laptop PC's?  The continuous obsolescence and
rejuvenation cycle of PC's and PC OSes is a real time and life "killer",
IMO.  I fully understand why *nix users don't want to give up *nix after
decades of use.  I didn't want to give up my C64 either...

> One option for running multiple instances of a piece of code is to
> reserve a page for offsets or pointers and then change just that page
> when switching between tasks. The page would be at a fixed location.
> It's a slight workaround but as long as invlpg invalidates just one
> page (which it should do but it's not guaranteed) it ought to be fast.
> It's perhaps the best option I have (for lightweight task switching
> between homogeneous tasks). The switched page would always be at the
> same location but each version would point to its local locations in
> the address space.

Yeah, I'm _not_ sold on _not_ keeping applications completely separated.  I
think we may have conversed on this previously...

> Yet another one is to use different models in 32-bit and 64-bit modes.
> The massive address space in 64-bit mode allows a different approach
> to achieve similar results in memory management.

Except for the negative speed issue, I'm slowly becoming more and more
convinced interpreters are the solution.  I can stop and start an
interpreter's emulation at will without risk of losing control of processor
execution, e.g., no control-flow hacks from buffer overruns etc.  Just how
do you crash an interpreter?  They are easy to write and maintain.  They can
produce fairly compact "code" as byte or token sequences.  They don't need
porting or recompiling for different cpu modes.  But, they do need a
different interpreter for each cpu mode or environment.
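
For what it's worth, the "stop and start at will" property can be sketched in a few lines of C. The opcodes, the `vm` struct and `vm_run` below are all made up for illustration; the only points being shown are that the host decides how many instructions run before control comes back (the `budget` argument) and that bad bytecode ends in a fault code rather than a wild jump.

```c
#include <assert.h>
#include <stddef.h>

enum { OP_HALT, OP_PUSH, OP_ADD, OP_MUL };   /* hypothetical opcodes */
enum { VM_HALTED, VM_PAUSED, VM_FAULT };     /* results of vm_run */

struct vm {
    const unsigned char *code;   /* byte-coded program */
    size_t len, pc;              /* program length and program counter */
    unsigned stack[16];          /* small operand stack */
    size_t sp;                   /* number of values on the stack */
};

/* Run at most `budget` instructions, then hand control back. */
static int vm_run(struct vm *m, unsigned budget)
{
    assert(m != NULL);
    while (budget-- > 0) {
        if (m->pc >= m->len) return VM_FAULT;       /* ran off the end */
        unsigned char op = m->code[m->pc++];
        switch (op) {
        case OP_HALT:
            m->pc--;                                /* stay halted if re-run */
            return VM_HALTED;
        case OP_PUSH:                               /* operand is next byte */
            if (m->pc >= m->len || m->sp >= 16) return VM_FAULT;
            m->stack[m->sp++] = m->code[m->pc++];
            break;
        case OP_ADD:
        case OP_MUL: {
            if (m->sp < 2) return VM_FAULT;         /* stack underflow */
            unsigned b = m->stack[--m->sp];
            unsigned a = m->stack[--m->sp];
            m->stack[m->sp++] = (op == OP_ADD) ? a + b : a * b;
            break;
        }
        default:
            return VM_FAULT;                        /* unknown opcode */
        }
    }
    return VM_PAUSED;                               /* budget used up */
}
```

Calling `vm_run` again on a paused machine simply continues from the saved `pc`, which is the whole of the "task switch" for interpreted code.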

> The point is to prevent the OS design being controlled by existing
> compilers - to prevent the tail from wagging the dog if you like.

Ah.  Well, I agree "bare metal" is good.  As a slight aside, Alexei proved
to me that bare metal does hide some errors, and that I should eventually
test on some emulators too.

> persistent activation records.

I'm not familiar with that term.  A few quick searches pull up "Napier88" and
"persistent programming language" and "orthogonal programming" and:

"Using C as a Compiler Target Language for Native Code Generation in
Persistent Systems" by S.J. Bushell, A. Dearle, A.L. Brown, & F.A. Vaughan.

But, you aren't using C...

http://en.wikipedia.org/wiki/Orthogonal_persistency
http://en.wikipedia.org/wiki/Persistent_programming_language

"Orthogonal persistence" from one of those seems to indicate a permanent or
semi-permanent state, like an OS which doesn't start up or shutdown but
resumes where it was previously.  FORTH, when used as an OS, is sort-of
similar.  All changes to the FORTH environment get preserved.

The other indicates something about data continuing to exist after an
application has been closed.  However, it doesn't really specify what a
continuing existence means.  Is the data "existing" in memory?  Is the data
"existing" on disk? If memory, how is it accessible outside the application?
etc.

> Functions which are not recursive don't need a stack frame either.

? ? ? ? ?

Hmm, I'll have to think about that statement for a while...

As a quick note, I avoid recursion and currently "see" beneficial use of
stack frames, e.g.,

 - allowing automatic ("auto" in C) allocation and cleanup of local
variables in procedures, functions, subroutines
 - helping to implement call-by-value
 - reducing the memory space needed by code for variables with temporary scope
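
The first point, and its link to recursion, can be seen in a small C sketch. Both functions are invented for illustration. They are identical except for where the local `mine` lives: one gets a fresh slot per activation, the other shares a single static slot between all activations.

```c
#include <assert.h>

/* `mine` is automatic: every activation gets its own copy in its frame. */
static unsigned sum_auto(unsigned n)
{
    unsigned mine = n;
    if (n == 0) return 0;
    unsigned rest = sum_auto(n - 1);
    return mine + rest;
}

/* `mine` is static: all activations share one slot, so the inner calls
   overwrite the value the outer call still needs. */
static unsigned sum_static(unsigned n)
{
    static unsigned mine;
    mine = n;
    if (n == 0) return 0;
    unsigned rest = sum_static(n - 1);
    return mine + rest;     /* mine is 0 here: the innermost call set it */
}
```

With an argument of 4, `sum_auto` returns 10 while `sum_static` returns 0. For a function that never calls itself, directly or indirectly, the two forms behave the same, which is the case where a frame is not strictly required.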

> cruft. (I stand by the term.)

Y'know, it's claims like that which lead to nicknames that one hates... Right?
e.g., James "Crufty" Harris...  :-)

> even though C is low level by design it still doesn't allow the call
> scheme to be controlled.

True.  I've chosen C, for the most part.  It can't do everything needed by
an OS: "some assembly is required".

> The average compiler implements calls and
> parameter passing the way it implements them and that's it.

How many calling conventions and/or parameter passing methods does one need?
As I see it a system should only need one (or two like typical C...).
Eventually everything on the system uses that one.

Rod Pemberton


James Harris  
Newsgroups: alt.os.development
From: James Harris <james.harri...@googlemail.com>
Date: Tue, 3 Nov 2009 03:51:15 -0800 (PST)
Local: Tues 3 Nov 2009 11:51
Subject: Re: May your privileged stack always be big enough
On 3 Nov, 09:43, "Rod Pemberton" <do_not_h...@nohavenot.cmm> wrote:

You are right. I was talking about a basic image of a program in
execution rather than that from disk.

...

> on the C64, people ask: "What's a 6510?"...  It's a 6502 with a port.

...

> IIRC, compared to other microprocessors at the time, it was what I'd call a
> memory based and/or "load-store" design - few registers, fast zero page
> memory instructions, accumulator/memory based programming model, etc.  IIRC,
> most of the magazines at the time called it load-store microprocessor too...
> IIRC, others, like the Z80 - which I never programmed - used registers far
> more heavily.

The big advantage of the Z80 architecture was that it had 16-bit
registers - and there were more of them. IIRC the humble 6502 had only
one 8-bit accumulator and two 8-bit index registers. Sounds poor now
but my recollection is that it was still great fun to use. Happy
memories! I wrote a disassembler for it but, IIRC, we still wrote the
programs in machine code. Of course its machine code was much easier
to remember. There weren't the 2- and 3-bit fields and the variable
length opcodes we have today.

>  Unfortunately, the term "load-store" has been warped over the
> ages.  Wikipedia's usage doesn't match with my recollections.

Would referring to the number of operands an instruction takes change
less over time?

  http://en.wikipedia.org/wiki/Instruction_set#Number_of_operands

...

> Ah.  Well, I agree "bare metal" is good.  As a slight aside, Alexei proved
> to me that bare metal does hide some errors, and that I should eventually
> test on some emulators too.

Was that on a.o.d? I'd like to look in to that some more.

> > persistent activation records.

> I'm not familiar with that term.  A few quick search pulls up "Napier88" and
> "persistent programming language" and "orthogonal programming" and:

It wasn't meant as a term in itself. I was just talking about making
an activation record persistent - i.e. to last between one activation
and another. I think this can have both performance and security
gains. (Less state to set up and static variables can stay in scope.)

...

> Y'know, it's claims like that which lead to nicknames that one hates... Right?
> e.g., James "Crufty" Harris...  :-)

That's a bit below the belt. We each have things we have stated we
dislike. We don't want to get back to the law of the playground here,
do we?

> > even though C is low level by design it still doesn't allow the call
> > scheme to be controlled.

> True.  I've chosen C, for the most part.  It can't do everything needed by
> an OS: "some assembly is required".

> > The average compiler implements calls and
> > parameter passing the way it implements them and that's it.

> How many calling conventions and/or parameter passing methods does one need?
> As I see it a system should only need one (or two like typical C...).
> Eventually everything on the system uses that one.

I see your point. My take on this is that where there are better or
faster ways of doing things an OS designer should consider them. I
have absolutely no problem with having multiple ways for apps to
interact with the OS as long as there is clear value in each one. For
example, some may be simpler, others may be faster. In fact, perhaps
the simpler ways can be wrappers around those which are faster. As you
say, they all come back to a common base.

James


Rod Pemberton  
Newsgroups: alt.os.development
From: "Rod Pemberton" <do_not_h...@nohavenot.cmm>
Date: Tue, 3 Nov 2009 17:12:14 -0500
Local: Tues 3 Nov 2009 22:12
Subject: Re: May your privileged stack always be big enough
"James Harris" <james.harri...@googlemail.com> wrote in message

news:68ce4334-9357-4029-9428-d74a1cfe5c00@m26g2000yqb.googlegroups.com...
On 3 Nov, 09:43, "Rod Pemberton" <do_not_h...@nohavenot.cmm> wrote:

> > As a slight aside, Alexei proved
> > to me that bare metal does hide some errors, and that I should eventually
> > test on some emulators too.

> Was that on a.o.d? I'd like to look in to that some more.

IIRC, there were a couple.

The most specific one I've been able to relocate was in a thread titled "new
FYSOS release".  It was a thread discussing Ben Lunt's FYSOS.  (The first
message allows pulling up the entire thread, if wanted.):
http://groups.google.com/group/alt.os.development/msg/b332b6f2b287ed8...

IIRC, Ben Lunt had problems with incorrect floppy emulation in older
versions of QEMU which might've been a thread titled "FYSOS under QEMU"
and/or "FYSOS under VPC".  First message:
http://groups.google.com/group/alt.os.development/msg/2f9f65340816dd8...

> I was just talking about making
> an activation record persistent - i.e. to last between one activation
> and another. I think this can have both performance and security
> gains. (Less state to set up and static variables can stay in scope.)

One of the things I've become very fond of with DOS (and Windows 98) is the
ability to start the OS in and from a clean state.

It's one of the things I dislike with newer versions of Windows which seem very
paternalistic to me.  If newer Windows journals something to the filesystem
when I'm shutting down, I can't get rid of it - "the problem" - via a reboot
like older OSes.  The journaling increases the persistence of the problem.
If the OS is experiencing some mystery glitch - which seems to happen with
all versions of Linux and Windows - I can reboot some OSes (like DOS)
and the problem goes away.  With newer OSes, the problem seems to reappear
due to persistent state, e.g., from journaling, or hibernation and sleep
modes.

> > Y'know, it's claims like that which lead to nicknames that one hates... Right?
> > e.g., James "Crufty" Harris... :-)

> That's a bit below the belt. We each have things we have stated we
> dislike. We don't want to get back to the law of the playground here, do
> we?

Sorry!  That wasn't meant in a hostile way, but a humorous one.  I was just
pointing out your persistent activation record, er.. use of "cruft".  ;-)

Rod Pemberton


Discussion subject changed to "nanokernel and persistent activation record" by Rod Pemberton
Rod Pemberton  
Newsgroups: alt.os.development
From: "Rod Pemberton" <do_not_h...@nohavenot.cmm>
Date: Tue, 3 Nov 2009 18:37:34 -0500
Local: Tues 3 Nov 2009 23:37
Subject: Re: nanokernel and persistent activation record

"Rod Pemberton" <do_not_h...@nohavenot.cmm> wrote in message

news:hcou43$p7i$1@aioe.org...

> "James Harris" <james.harri...@googlemail.com> wrote in message
> news:6761173d-4f2b-43e2-807f-abd40d8c9c9d@37g2000yqm.googlegroups.com...

> > That said, in protected mode
> > we had an effective DS register that is no longer there. Instead, for
> > OS code at least we are encouraged to use GS to locate local data.

> A good reason to at least consider nanokernels...

The first part of quote on QNX from Wikipedia interests me, but the last
sentence might interest you.  It seems to be one method for implementing a
"persistent activation record".

"The QNX kernel contains only CPU scheduling, interprocess communication,
interrupt redirection and timers. Everything else runs as a user process,
including a special process known as proc which performs process creation,
and memory management by operating in conjunction with the microkernel.
This is made possible by two key mechanisms - subroutine-call type
interprocess communication, and a boot loader which can load an image
containing not only the kernel but any desired collection of user programs
and shared libraries."

****
Someone a while back (James?  Ben?) asked what was needed in an OS.  IIRC, I
answered in terms of hardware and hardware interfaces, such as USB,
harddisk, etc.  However, from Wikipedia, it's easy to summarize what is
needed for a microkernel:

A minimal microkernel
- address spaces mechanisms
- cpu scheduling
- IPC

QNX microkernel
- cpu scheduling
- IPC
- interrupt redirection
- timers

L4 microkernel
- address spaces
- threads
- cpu scheduling
- IPC (very low cost)
- low cache footprint

Rod Pemberton


derrick  
Newsgroups: alt.os.development
From: derrick <derrick.ke...@gmail.com>
Date: Tue, 3 Nov 2009 18:01:47 -0800 (PST)
Local: Wed 4 Nov 2009 02:01
Subject: Re: nanokernel and persistent activation record
On Nov 3, 6:37 pm, "Rod Pemberton" <do_not_h...@nohavenot.cmm> wrote:

Rod, everything you say about an L4 kernel also applies to QNX
Neutrino.

Discussion subject changed to "May your privileged stack always be big enough" by James Harris
James Harris  
Newsgroups: alt.os.development
From: James Harris <james.harri...@googlemail.com>
Date: Wed, 4 Nov 2009 05:14:58 -0800 (PST)
Local: Wed 4 Nov 2009 13:14
Subject: Re: May your privileged stack always be big enough
On 3 Nov, 22:12, "Rod Pemberton" <do_not_h...@nohavenot.cmm> wrote:

Thanks. I'll take a look at them.

James


BGB / cr88192  
Newsgroups: alt.os.development
From: "BGB / cr88192" <cr88...@hotmail.com>
Date: Wed, 11 Nov 2009 15:07:01 -0700
Local: Wed 11 Nov 2009 22:07
Subject: Re: May your privileged stack always be big enough

"James Harris" <james.harri...@googlemail.com> wrote in message

news:72fd8b50-eaad-4764-8ff4-9ad64b530d5c@w19g2000yqk.googlegroups.com...

>I normally list the key points of a post in the subject heading but in
> this case there are just too many.... The post is about detecting
> application stack overflow and underflow and, in particular,
> protecting and sizing the privileged stack in 32-bit and 64-bit modes.

> I'd appreciate your thoughts, suggestions and corrections.

> I'm looking at the base Intel and AMD 64-bit architecture (which I'll
> call x86-64 herein) with a view to it influencing my 32-bit code. Why?
> Well, it seems sensible to design 32-bit operations which don't
> require too many changes to port to 64-bit later. I've not looked at
> 64-bit working before. It is quite different, isn't it!

the main thing for 32/64 bit compatibility (at the C level) is writing
fairly generic code...

> 1) In x86-64 the stack segment has base = 0 and limit = none as do
> code and data segments. So it's not even an option to detect stack
> overflow (a request for stack expansion) or underflow (trying to
> remove more than the stack holds) by reference to the stack segment.
> The only option I can think of is to have guard page frames above and
> below every application (non-privileged) stack. These would be marked
> not-present. Is this the best way to detect application stack overflow
> and underflow?

yes, the flat model is the only real option for x86-64...

as for not-present pages around the stack, this is a common practice at
least...
another practice is to not bother and assume the stack is big enough, but
this is more of a lazy option...

> 2) The privileged stack is a critical resource, isn't it? AFAICS it
> must always have present memory to write to. If, in a page fault,
> there is not enough stack space we'll get a double fault. And because
> double faults are not restartable there is no apparent means of
> recovery. So how is it best to provide privileged stack space? Should
> its size be checked at the top or bottom of some or all service
> routines, or can all service routines be written to unwind it before
> returning to user mode? It seems so but it would be good to hear what
> you guys have done or are thinking of.

my comment: treat the stack as one normally treats the stack, as in, always
fully unwind before returning.

granted, there are several different ways to handle the Ring3->Ring0
transition.
I forget the details (my OS dev work was a long time ago), but what I
remember is that the initial transition to ring-0 took place while using
the ring-3 stack, which was used to save off the ring-3 state, and at this
point a bit of magic was used to swap the stack (and maybe fully transition
to ring-0), and transfer control back into the C code in the kernel.

from what I remember, this mechanism was also used in thread and process
switching.
in this way, the state of a thread was always saved at the bottom of the
stack.

> 3) If the privileged stack must always be large enough how much space
> should be set aside? If it is only used to service interrupts and
> syscalls it probably doesn't need to be very big. A 4k page seems much
> too large. The bulk of the state can be saved in a thread image if
> desirable.

errm, you will usually end up doing stuff in the kernel (as in, calling
through piles of C code, ...), so one will need a little more stack than
this...
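
one way to put a number on "a little more" is stack painting, which is a
different technique from anything I described doing above: fill the stack
region with a known pattern before it is used, run the workload, then scan
from the low end to see how far down the stack actually got. a minimal
sketch, with a made-up pattern and made-up function names, written as
hosted C so it can be tried outside a kernel:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define STACK_PAINT 0xDEADBEEFu  /* arbitrary fill pattern (assumption) */

/* Fill a not-yet-used stack region with the paint pattern.
 * 'lowest' is the lowest address of the region. */
void stack_paint(uint32_t *lowest, size_t words)
{
    assert(lowest != NULL);
    for (size_t i = 0; i < words; i++)
        lowest[i] = STACK_PAINT;
}

/* The stack grows down, so usage eats into the region from the top.
 * Scan up from the lowest address; the first overwritten word marks the
 * deepest point reached. Returns the observed peak usage in bytes. */
size_t stack_high_water(const uint32_t *lowest, size_t words)
{
    assert(lowest != NULL);
    size_t untouched = 0;
    while (untouched < words && lowest[untouched] == STACK_PAINT)
        untouched++;
    return (words - untouched) * sizeof(uint32_t);
}
```

the result is only a lower bound: a frame that reserves space without
writing to it, or that happens to store the pattern itself, will be
under-counted, so some margin on top is still needed.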

in my OS, from what I remember, I used 32kB for the ring-0 stack.
actually, from what I remember, I set up the stack in the boot loader to
point just below the boot loader, and then proceeded to load the
second-stage loader to 0x8000, which used the same stack.
then I loaded the kernel (I think between 0x10000 and 0x9FFFF or
similar...), and continued using the same stack after kernel startup.

(this may have changed later on, as I think I am remembering something about
having used RLEW compression on the kernel, but am not sure...).

memory above 1M was then generally used for kernel heap and for application
code/data.

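so, the stack was 32kB (strictly a little under: 0x7C00 is 31kB, and the
bottom of that range holds the realmode IVT and BIOS data area) mostly
because the stack top was at 0x7C00 or similar.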

similarly, the second stage remained in memory after kernel startup, itself
mostly serving as a place for holding the GDT and IDT, and also any realmode
components of the kernel (where some drivers, such as for VESA, had worked
by essentially jumping back and forth between realmode and protected mode,
...).

from what I remember, the kernel also generally operated with a (mostly)
plain raw address space view of the world (except I also remember using
some trick where the page table included itself as one of its own entries,
such that the whole page table looked like a flat 4MB buffer regardless of
which pages were used to build its structure).
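
the self-reference trick comes down to a little address arithmetic. a
sketch, assuming plain 32-bit paging with 4kB pages (no PAE) and assuming
the self-referencing entry sits in the last page-directory slot; I don't
remember which slot I actually used, and the names below are made up:

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: page-directory slot 1023 points back at the page directory
 * itself. Any slot would work; the last one is a common choice. */
#define RECURSIVE_SLOT 1023u

/* Base of the 4MB window through which all page tables appear contiguous. */
#define PT_WINDOW_BASE ((uint32_t)RECURSIVE_SLOT << 22)

static_assert(PT_WINDOW_BASE == 0xFFC00000u, "window is the top 4MB");

/* Virtual address of the page-table entry that maps vaddr:
 * one 4-byte entry per 4kB page, laid out flat inside the window. */
uint32_t pte_vaddr(uint32_t vaddr)
{
    return PT_WINDOW_BASE + (vaddr >> 12) * 4u;
}

/* Virtual address of the page-directory entry that covers vaddr.
 * The directory shows up as the "page table" for the window itself. */
uint32_t pde_vaddr(uint32_t vaddr)
{
    return PT_WINDOW_BASE + (RECURSIVE_SLOT << 12) + (vaddr >> 22) * 4u;
}
```

with this layout the kernel can edit any mapping by writing through
pte_vaddr() without needing to know the physical address of the page table
involved, at the cost of 4MB of virtual address space.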

but, alas, it is difficult to remember the specifics after what is around
7-8 years now...

