Pointer arithmetic in C#

aggle-rithm

I am working on a file parsing system in C#, and I want to share an array of bytes between objects, with each object only looking at a subset of the array. There is some overlap, as it breaks the file down into hierarchical structures, some of which contain subsets of the data within "child" objects, as well as the complete set.

In C++, this was fairly simple to do, since all I had to do was add the offset to the original pointer, and there was my new byte array. In C# it's more difficult because the array is an encapsulated object.

I've already started to work with one solution that wraps the array object in a class that allows subsets to be easily referred to, but I'm wondering if I'm missing an easier solution.
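
For anyone following along, here is a minimal sketch of what such a wrapper might look like. The `ByteSlice` name and its members are made up for illustration and are not the poster's actual code; the only idea taken from the post is that every view shares the one underlying array, and a child view is created by adding an offset, much like `pointer + offset` in C++.

```csharp
using System;

// Hypothetical wrapper: a bounds-checked view over part of a shared byte array.
public sealed class ByteSlice
{
    private readonly byte[] _buffer; // shared by all views, never copied
    private readonly int _offset;    // where this view starts inside _buffer

    public int Length { get; }

    public ByteSlice(byte[] buffer) : this(buffer, 0, buffer.Length) { }

    private ByteSlice(byte[] buffer, int offset, int length)
    {
        _buffer = buffer;
        _offset = offset;
        Length = length;
    }

    // Index is relative to the start of this view.
    public byte this[int index]
    {
        get { return _buffer[_offset + Check(index)]; }
        set { _buffer[_offset + Check(index)] = value; }
    }

    // Child view, with offset relative to this view (the "pointer + offset" step).
    public ByteSlice Slice(int offset, int length)
    {
        if (offset < 0 || length < 0 || offset > Length - length)
            throw new ArgumentOutOfRangeException();
        return new ByteSlice(_buffer, _offset + offset, length);
    }

    private int Check(int index)
    {
        if (index < 0 || index >= Length)
            throw new IndexOutOfRangeException();
        return index;
    }
}

public static class SliceDemo
{
    public static void Main()
    {
        var file = new byte[] { 10, 20, 30, 40, 50, 60 };
        var whole = new ByteSlice(file);
        var record = whole.Slice(2, 4);  // covers file[2] through file[5]
        var field = record.Slice(1, 2);  // covers file[3] and file[4]

        field[0] = 99;                   // writes file[3]
        Console.WriteLine(file[3]);      // 99
        Console.WriteLine(record[1]);    // 99
    }
}
```

Parent and child views overlap freely because none of them owns the data; they only remember where they start and how long they are.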
 
I'm not so familiar with C#, but I think you should be able to use iterators over your array in pretty much the same way you would use pointers in C++.

Dr. Stupid
 
So you want to be able to pass 'slices' of the array around, such that any modification to the slice also affects the whole array?

Don't fight the object-based approach. That will be the 'easier' solution in C#. It's easier than faking pointer arithmetic, at any rate.
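
If write-through slices are what you want, the framework's own `ArraySegment<T>` struct is one object-based option. A rough sketch, where the `Clear` helper and the sample data are invented for the example; it goes through `Array` and `Offset` rather than an indexer so that it should also compile on older framework versions:

```csharp
using System;

public static class SegmentDemo
{
    // Hypothetical helper: zero out whatever range the caller was handed.
    public static void Clear(ArraySegment<byte> segment)
    {
        for (int i = 0; i < segment.Count; i++)
            segment.Array[segment.Offset + i] = 0;
    }

    public static void Main()
    {
        byte[] file = { 1, 2, 3, 4, 5, 6 };

        // A view of file[2] through file[4]; no bytes are copied.
        var middle = new ArraySegment<byte>(file, 2, 3);
        Clear(middle);

        // The change is visible through the original array.
        Console.WriteLine(file[1] + " " + file[2] + " " + file[4] + " " + file[5]); // 2 0 0 6
    }
}
```

The segment is only an (array, offset, count) triple, so passing it around is cheap and every holder sees the same bytes.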
 
The original purpose of this system, although I might add to it in the future, is to go into an Excel file and grab just the worksheet names. Excel files have a logical format (BIFF) built onto a physical format (compound document file format), so that the logical slices can be subsets of the physical slices. For the sake of data integrity, I really didn't want to split the same data up into multiple buffers. However, memory is cheap, so I may end up taking that approach after all.

Thanks for everyone's input.
 
Unless performance is critical, you can easily get names of the Worksheets in a Workbook by doing one of the following:

1. Add a reference to the Excel object library in C# and simply loop through each worksheet in each file (workbook) that you want.
2. Do basically the same as in option 1 but use VBA in Excel instead.
 
Don't fight the object-based approach. That will be the 'easier' solution in C#. It's easier than faking pointer arithmetic, at any rate.
I would agree, here. You don't need to use pointers in C#, except in the most special of cases. (I wish I could remember some good examples of such cases.)
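
For reference, this is roughly what real pointer arithmetic looks like in C# when you do reach for it. It is only a sketch: the `Sum` method is made up for the example, and the code has to be compiled with unsafe code enabled (for instance `<AllowUnsafeBlocks>true</AllowUnsafeBlocks>` in the project file).

```csharp
using System;

public static class PointerDemo
{
    // Sums `count` bytes starting at `offset`, using C-style pointer arithmetic.
    public static unsafe int Sum(byte[] buffer, int offset, int count)
    {
        if (offset < 0 || count < 0 || offset > buffer.Length - count)
            throw new ArgumentOutOfRangeException();

        int total = 0;
        // `fixed` pins the array so the garbage collector cannot move it
        // while a raw pointer into it is in use.
        fixed (byte* start = buffer)
        {
            byte* p = start + offset; // same idea as in C++
            for (int i = 0; i < count; i++)
                total += p[i];
        }
        return total;
    }

    public static void Main()
    {
        byte[] file = { 1, 2, 3, 4, 5 };
        Console.WriteLine(Sum(file, 1, 3)); // 2 + 3 + 4 = 9
    }
}
```

Note that the pointer is only valid inside the `fixed` block, so it cannot be stored in an object and handed around the way a slice can.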

In Memory tables are great. You can use SQL queries to look at the subset you want.
Yeah, but performance is a drag. Cost/Benefit is not worthwhile vs. arrays of files and stuff, for this purpose.
That is an idea worth considering if you have lots of data and are performing relatively complex analysis of it; you can let SQL do most of the hard work.
 
Unless performance is critical, you can easily get names of the Worksheets in a Workbook by doing one of the following:

1. Add a reference to the Excel object library in C# and simply loop through each worksheet in each file (workbook) that you want.

I agree 100%. I've used this technique before, and it works quite well, even if the documentation can be somewhat hard to find...
 
Unless performance is critical, you can easily get names of the Worksheets in a Workbook by doing one of the following:

1. Add a reference to the Excel object library in C# and simply loop through each worksheet in each file (workbook) that you want.
2. Do basically the same as in option 1 but use VBA in Excel instead.

That brings me to my original reason for doing all this: Excel, used as an object, is extremely unstable in multi-threaded applications. Even if it's a multi-threaded application in which the instance of Excel is used ONLY IN THE MAIN THREAD, it will crash consistently.

Unfortunately, we have engineers where I work that insist on using Excel as a database system, even though they have Access and SQL Server. The worksheets are enormously complex and there are THOUSANDS of them that have to be accessed each day. Even in a single-threaded application, there are numerous times when the Excel object locks up and another needs to be started. We end up with fifty instances of Excel running in the background.

Accessing them through OLEDB is much more stable, but you have to know the sheet names, and we can't count on people always observing the naming conventions. Hence, the sheet-name-reading code.
 
Just create an "ExcelFileInfo" class which stores the byte array and then have nested classes (or structures if the info contains no reference types and will not be boxed) which deal with the breakdown of the array into smaller units.

That, or one of the billion other reasonable ways of compartmentalizing the array. Just do it in a way that completely hides the inner data structure from the rest of the proggy.

Your life will be much easier in the long run.

If you feel that you will be instantiating and destroying too many objects because of the numbers of calls, you can always use resurrection and an object pool to lower the load on the garbage collector.
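
A bare-bones sketch of the pooling half of that idea. It uses an explicit `Return` call rather than finalizer-based resurrection, and `ObjectPool<T>` and `RecordView` are hypothetical names invented here, not framework types. It is not thread-safe as written.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical pool: hands out reusable instances instead of allocating new ones.
public sealed class ObjectPool<T> where T : class, new()
{
    private readonly Stack<T> _free = new Stack<T>();

    public T Rent()
    {
        return _free.Count > 0 ? _free.Pop() : new T();
    }

    public void Return(T item)
    {
        if (item == null) throw new ArgumentNullException(nameof(item));
        _free.Push(item);
    }
}

// Hypothetical reusable view; Reset re-points it at a new range of the buffer.
public sealed class RecordView
{
    public byte[] Buffer { get; private set; }
    public int Offset { get; private set; }
    public int Length { get; private set; }

    public void Reset(byte[] buffer, int offset, int length)
    {
        Buffer = buffer;
        Offset = offset;
        Length = length;
    }
}

public static class PoolDemo
{
    public static void Main()
    {
        var pool = new ObjectPool<RecordView>();
        var file = new byte[] { 1, 2, 3, 4 };

        var first = pool.Rent();
        first.Reset(file, 0, 2);
        pool.Return(first);

        var second = pool.Rent();   // reuses the instance returned above
        second.Reset(file, 2, 2);
        Console.WriteLine(ReferenceEquals(first, second)); // True
    }
}
```

Whether this is worth the extra bookkeeping depends on how many views are created per file; making the view a small struct is another way to keep it off the garbage collector's plate.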

Do NOT access Excel worksheets through COM from a multi-threaded application. Not only are Excel objects designed to run in STA mode, they also require specific user rights and the ability to interact with the desktop. Microsoft does not support this in any way, shape, or form in the current versions of Office. =o/
 
Using Excel as a database is the work of the devil. They can't do something stupid, and expect you to unstupid it.
 
I'm totally going to steal that line...

I am going to have to disagree with you though. Of course they can do something stupid and expect someone else to unstupid it. They shouldn't, but they can, and frequently will. As a developer, I feel it is my job to smack them upside the head, and then give them a better solution. People won't respond to arguments that are basically "but you are doing it wrong".

Present to them why something is wrong, the problems found when doing it wrong, what should be done to correct it, how it can be implemented, and what the advantages are to doing it correctly. If they see the light and decide to fix it, perfect. If not, start looking for another job, because a company that willfully continues to do something the wrong way will only end up costing themselves time and money, and stress the hell out of you...
 
And I intend to steal the above two paragraphs from you!
 
